PROSTATE-SPECIFIC GENE, PCGEM1, AND METHODS OF USING PCGEM1 
TO DETECT, TREAT, AND PREVENT PROSTATE CANCER 

CROSS REFERENCE TO RELATED APPLICATIONS 

The present application claims the benefit of United States provisional 
application S.N. 60/126,469, filed March 26, 1999, the entire disclosure of which is 
relied upon and incorporated by reference. 

FIELD OF THE INVENTION 

The present invention relates to nucleic acids that are expressed in prostate 
tissue. More particularly, the present invention relates to the first of a family of novel, 
androgen-regulated, prostate-specific genes, PCGEM1, that is over-expressed in 
prostate cancer, and methods of using the PCGEM1 sequence and fragments thereof 
to measure the hormone responsiveness of prostate cancer cells and to detect, 
diagnose, prevent and treat prostate cancer and other prostate related diseases. 

BACKGROUND 

Prostate cancer is the most common solid tumor in American men (1). The 
wide spectrum of biologic behavior (2) exhibited by prostatic neoplasms poses a 
difficult problem in predicting the clinical course for the individual patient (3, 4). 
Public awareness of prostate specific antigen (PSA) screening efforts has led to an 
increased diagnosis of prostate cancer. The increased diagnosis and greater number of 
patients presenting with prostate cancer has resulted in wider use of radical 
prostatectomy for localized disease (5). Accompanying the rise in surgical 
intervention is the frustrating realization of the inability to predict organ-confined 
disease and clinical outcome for a given patient (5, 6). Traditional prognostic 
markers, such as grade, clinical stage, and pretreatment PSA have limited prognostic 
value for individual men. There is clearly a need to recognize and develop molecular 
and genetic biomarkers to improve prognostication and the management of patients 
with clinically localized prostate cancer. As with other common human neoplasia (7), 
the search for molecular and genetic biomarkers to better define the genesis and 
progression of prostate cancer is the key focus for cancer research investigations 
worldwide. 

The new wave of research addressing molecular genetic alterations in prostate 
cancer is primarily due to increased awareness of this disease and the development of 
newer molecular technologies. The search for the precursor of prostatic 
adenocarcinoma has focused largely on the spectrum of microscopic changes referred 
to as "prostatic intraepithelial neoplasia" (PIN). Bostwick defines this spectrum as a 
histopathologic continuum that culminates in high grade PIN and early invasive 
cancer (8). The morphologic and molecular changes include the progressive 
disruption of the basal cell-layer, changes in the expression of differentiation markers 
of the prostatic secretory epithelial cells, nuclear and nucleolar abnormalities, 
increased cell proliferation, DNA content alterations, and chromosomal and allelic 
losses (8, 9). These molecular and genetic biomarkers, particularly their progressive 
gain or loss, can be followed to trace the etiology of prostate carcinogenesis. 
Foremost among these biomarkers would be the molecular and genetic markers 
associated with histological phenotypes in transition between normal prostatic 
epithelium and cancer. Most studies to date agree that PIN and prostatic 
adenocarcinoma cells share many features. Invasive carcinoma more often 
reflects a magnification of some of the events already manifest in PIN. 

Early detection of prostate cancer is possible today because of the widely 
propagated and recommended blood PSA test that provides a warning signal for 
prostate cancer if high levels of serum PSA are detected. However, when used alone, 
PSA is not sufficiently sensitive or specific to be considered an ideal tool for the early 
detection or staging of prostate cancer (10). Combining PSA levels with clinical 
staging and Gleason scores is more predictive of the pathological stage of localized 
prostate cancer (11). In addition, new molecular techniques are being used for 
improved molecular staging of prostate cancer (12, 13). For instance, reverse 
transcriptase - polymerase chain reaction (RT-PCR) can measure PSA of circulating 
prostate cells in blood and bone marrow of prostate cancer patients. 

Despite new molecular techniques, however, as many as 25 percent of men 
with prostate cancer will have normal PSA levels - usually defined as those equal to 
or below 4 nanograms per milliliter of blood (14). In addition, more than 50 percent 
of the men with higher PSA levels are actually cancer free (14). Thus, PSA is not an 
ideal screening tool for prostate cancer. More reliable tumor-specific biomarkers are 
needed that can distinguish between normal and hyperplastic epithelium, and the 
preneoplastic and neoplastic stages of prostate cancer. 

Identification and characterization of genetic alterations defining prostate 
cancer onset and progression is important in understanding the biology and clinical 
course of the disease. The currently available TNM staging system assigns the 
original primary tumor (T) to one of four stages (14). The first stage, T1, indicates 
that the tumor is microscopic and cannot be felt on rectal examination. T2 refers to 
tumors that are palpable but fully contained within the prostate gland. A T3 
designation indicates the cancer has spread beyond the prostate into surrounding 
connective tissue or has invaded the neighboring seminal vesicles. T4 cancer has 
spread even further. The TNM staging system also assesses whether the cancer has 
metastasized to the pelvic lymph nodes (N) or beyond (M). Metastatic tumors result 
when cancer cells break away from the original tumor, circulate through the blood or 
lymph, and proliferate at distant sites in the body. 

Recent studies of metastatic prostate cancer have shown a significant 
heterogeneity of allelic losses of different chromosome regions between multiple 
cancer foci (21-23). These studies have also documented that the metastatic lesion 
can arise from cancer foci other than dominant tumors (22). Therefore, it is critical to 
understand the molecular changes which define the prostate cancer metastasis 
especially when prostate cancer is increasingly detected in early stages (15-21). 

Moreover, the multifocal nature of prostate cancer needs to be considered (22- 
23) when analyzing biomarkers that may have potential to predict tumor progression 
or metastasis. Approximately 50-60% of patients treated with radical prostatectomy 
for localized prostate carcinomas are found to have microscopic disease that is not 
organ confined, and a significant portion of these patients relapse (24). Utilizing 
biostatistical modeling of traditional and genetic biomarkers such as p53 and bcl-2, 
Bauer et al. (25-26) were able to identify patients at risk of cancer recurrence after 
surgery. Thus, there is clearly a need to develop biomarkers defining various stages 
of the prostate cancer progression. 

Another significant aspect of prostate cancer is the key role that androgens 
play in the development of both the normal prostate and prostate cancer. Androgen 
ablation, also referred to as "hormonal therapy," is a common treatment for prostate 
cancer, particularly in patients with metastatic disease (14). Hormonal therapy aims 
to inhibit the body from making androgens or to block the activity of androgen. One 
way to block androgen activity involves blocking the androgen receptor; however, 
that blockage is often only successful initially. For example, 70-80% of patients with 
advanced disease exhibit an initial subjective response to hormonal therapy, but most 
tumors progress to an androgen-independent state within two years (16). One 
mechanism proposed for the progression to an androgen-independent state involves 
constitutive activation of the androgen signaling pathway, which could arise from 
structural changes in the androgen receptor protein (16). 

As indicated above, the genesis and progression of cancer cells involve 
multiple genetic alterations as well as a complex interaction of several gene products. 
Thus, various strategies are required to fully understand the molecular genetic 
alterations in a specific type of cancer. In the past, most molecular biology studies 
had focused on mutations of cellular proto-oncogenes and tumor suppressor genes 
(TSGs) associated with prostate cancer (7). Recently, however, there has been an 
increasing shift toward the analysis of "expression genetics" in human cancer (27-31), 
i.e., the under-expression or over-expression of cancer-specific genes. This shift 
addresses limitations of the previous approaches including: 1) labor intensive 
technology involved in identifying mutated genes that are associated with human 
cancer; 2) the limitations of experimental models with a bias toward identification of 
only certain classes of genes, e.g., identification of mutant ras genes by transfection of 
human tumor DNAs utilizing NIH3T3 cells; and 3) the recognition that the human 
cancer associated genes identified so far do not account for the diversity of cancer 
phenotypes. 

A number of studies are now addressing alterations of prostate cancer-associated 
gene expression in patient specimens (32-36). More reports along these lines will 
inevitably follow. 

Thus, despite the growing body of knowledge regarding prostate cancer, there 
is still a need in the art to uncover the identity and function of the genes involved in 
prostate cancer pathogenesis. There is also a need for reagents and assays to 
accurately detect cancerous cells, to define various stages of prostate cancer 
progression, to identify and characterize genetic alterations defining prostate cancer 
onset and progression, to detect micro-metastasis of prostate cancer, and to treat and 
prevent prostate cancer. 



SUMMARY OF THE INVENTION 

The present invention relates to the identification and characterization of a 
novel gene, the first of a family of genes, designated PCGEM1, for Prostate Cancer 
Gene Expression Marker 1. PCGEM1 is specific to prostate tissue, is 
androgen-regulated, and appears to be over-expressed in prostate cancer. More recent studies 
associate PCGEM1 cDNA with promoting cell growth. The invention provides the 
isolated nucleotide sequence of PCGEM1 or fragments thereof and nucleic acid 
sequences that hybridize to PCGEM1. These sequences have utility, for example, as 
markers of prostate cancer and other prostate related diseases, and as targets for 
therapeutic intervention in prostate cancer and other prostate related diseases. The 
invention further provides a vector that directs the expression of PCGEM1, and a host 
cell transfected or transduced with this vector. 

In another embodiment, the invention provides a method of detecting prostate 
cancer cells in a biological sample, for example, by using nucleic acid amplification 
techniques with primers and probes selected to bind specifically to the PCGEM1 
sequence. The invention further comprises a method of selectively killing a prostate 
cancer cell, a method of identifying an androgen responsive cell line, and a method of 
measuring responsiveness of a cell line to hormone-ablation therapy. 

In another aspect, the invention relates to an isolated polypeptide encoded by 
the PCGEM1 gene or a fragment thereof, and antibodies generated against the 
PCGEM1 polypeptide, peptides, or portions thereof, which can be used to detect, 
treat, and prevent prostate cancer. 

Additional features and advantages of the invention will be set forth in the 
description which follows, and in part will be apparent from the description, or may 
be learned by practice of the invention. The objectives and other advantages of the 
invention will be realized and attained by the sequences, cells, vectors, and methods 
particularly pointed out in the written description and claims herein as well as the 
appended drawings. 



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the scheme for the identification of differentially expressed 
genes in prostate tumor and normal tissues. 

Figure 2 depicts a differential display pattern of mRNA obtained from 
matched tumor and normal tissues of a prostate cancer patient. Arrows indicate 
differentially expressed cDNAs. 

Figure 3 depicts the analysis of PCGEM1 expression in primary prostate 
cancers. 

Figure 4 depicts the expression pattern of PCGEM1 in prostate cancer cell 
lines. 

Figure 5a depicts the androgen regulation of PCGEM1 expression in LNCaP 
cells, as measured by reverse transcriptase PCR. 

Figure 5b depicts the androgen regulation of PCGEM1 expression in LNCaP 
cells, as measured by Northern blot hybridization. 

Figure 6a depicts the prostate tissue specific expression pattern of PCGEM1. 

Figure 6b depicts an RNA master blot showing the prostate tissue specificity of 
PCGEM1. 

Figure 7A depicts the chromosomal localization of PCGEM1 by fluorescent in 
situ hybridization analysis. 

Figure 7B depicts a DAPI counter-stained chromosome 2 (left), an inverted 
DAPI stained chromosome 2 shown as G-bands (center), and an ideogram of 
chromosome 2 showing the localization of the signal to band 2q32 (bar). 

Figure 8 depicts a cDNA sequence of PCGEM1 (SEQ ID NO:1). 

Figure 9 depicts an additional cDNA sequence of PCGEM1 (SEQ ID NO:2). 

Figure 10 depicts the colony formation of NIH3T3 cell lines expressing 
various PCGEM1 constructs. 

Figure 11 depicts the cDNA sequence of the promoter region of PCGEM1 
(SEQ ID NO:3). 

Figure 12 depicts the cDNA of a probe, designated SEQ ID NO:4. 

Figure 13 depicts the cDNAs of primers 1-3, designated SEQ ID NOs:5-7, 
respectively. 

Figure 14 depicts the genomic DNA sequence of PCGEM1, designated SEQ 
ID NO:8. 

Figure 15 depicts the structure of the PCGEM1 transcription unit. 

Figure 16 depicts a graph of the hypothetical coding capacity of PCGEM1. 

Figure 17 depicts a representative example of in situ hybridization results 
showing PCGEM1 expression in normal and tumor areas of prostate cancer tissues. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to PCGEM1, the first of a family of genes, and 
its related nucleic acids, proteins, antigens, and antibodies for use in the detection, 
prevention, and treatment of prostate cancer (e.g., prostatic intraepithelial neoplasia 
(PIN), adenocarcinomas, nodular hyperplasia, and large duct carcinomas) and prostate 
related diseases (e.g., benign prostatic hyperplasia), and kits comprising these 
reagents. 

Although we do not wish to be limited by any theory or hypothesis, 
preliminary data suggest that the PCGEM1 nucleotide sequence may be related to a 
family of non-coding poly A+ RNA that may be implicated in processes relating to 
growth and embryonic development (40-44). Evidence presented herein supports this 
hypothesis. Alternatively, PCGEM1 cDNA may encode a small peptide. 

NUCLEIC ACID MOLECULES 

In a particular embodiment, the invention relates to certain isolated nucleotide 
sequences that are substantially free from contaminating endogenous material. A 
"nucleotide sequence" refers to a polynucleotide molecule in the form of a separate 
fragment or as a component of a larger nucleic acid construct. The nucleic acid 
molecule has been derived from DNA or RNA isolated at least once in substantially 
pure form and in a quantity or concentration enabling identification, manipulation, 
and recovery of its component nucleotide sequences by standard biochemical methods 
(such as those outlined in Sambrook et al., Molecular Cloning: A Laboratory Manual, 
2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989)). 

Nucleic acid molecules of the invention include DNA in both single-stranded 
and double-stranded form, as well as the RNA complement thereof. DNA includes, 
for example, cDNA, genomic DNA, chemically synthesized DNA, DNA amplified by 
PCR, and combinations thereof. Genomic DNA may be isolated by conventional 
techniques, e.g., using the cDNA of SEQ ID NO:1, SEQ ID NO:2, or suitable 
fragments thereof, as a probe. 

The DNA molecules of the invention include full length genes as well as 
polynucleotides and fragments thereof. The full length gene may include the N- 
terminal signal peptide. Although a non-coding role of PCGEM1 appears likely, the 
possibility of a protein product cannot presently be ruled out. Therefore, other 
embodiments may include DNA encoding a soluble form, e.g., encoding the 
extracellular domain of the protein, either with or without the signal peptide. 

The nucleic acids of the invention are preferably derived from human 
sources, but the invention includes those derived from non-human species, as well. 

Preferred Sequences 

Particularly preferred nucleotide sequences of the invention are SEQ ID NO:1, 
SEQ ID NO:2, and SEQ ID NO:8, as set forth in Figures 8, 9, and 14, respectively. 
Two cDNA clones having the nucleotide sequences of SEQ ID NO:1 and SEQ ID 
NO:2, and the genomic DNA having the nucleotide sequence of SEQ ID NO:8, were 
isolated as described in Example 2. 

Thus, in a particular embodiment, this invention provides an isolated nucleic 
acid molecule selected from the group consisting of (a) the polynucleotide sequence 
of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (b) an isolated nucleic acid 
molecule that hybridizes to either strand of a denatured, double-stranded DNA 
comprising the nucleic acid sequence of (a) under conditions of moderate stringency 
in 50% formamide and about 6X SSC at about 42°C with washing conditions of 
approximately 60°C, about 0.5X SSC, and about 0.1% SDS; (c) an isolated nucleic 
acid molecule that hybridizes to either strand of a denatured, double-stranded DNA 
comprising the nucleic acid sequence of (a) under conditions of high stringency in 
50% formamide and about 6X SSC, with washing conditions of approximately 68°C, 
about 0.2X SSC, and about 0.1% SDS; (d) an isolated nucleic acid molecule derived 
by in vitro mutagenesis from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (e) an 
isolated nucleic acid molecule degenerate from SEQ ID NO:1, SEQ ID NO:2, or SEQ 
ID NO:8 as a result of the genetic code; and (f) an isolated nucleic acid molecule 
selected from the group consisting of human PCGEM1 DNA, an allelic variant of 
human PCGEM1 DNA, and a species homolog of PCGEM1 DNA. 

As used herein, conditions of moderate stringency can be readily determined 
by those having ordinary skill in the art based on, for example, the length of the DNA. 
The basic conditions are set forth by Sambrook et al. Molecular Cloning: A 
Laboratory Manual, 2d ed. Vol. 1, pp. 1.101-104, Cold Spring Harbor Laboratory 
Press, (1989), and include use of a prewashing solution for the nitrocellulose filters of 
about 5X SSC, about 0.5% SDS, and about 1.0 mM EDTA (pH 8.0), hybridization 
conditions of about 50% formamide, about 6X SSC at about 42°C (or other similar 
hybridization solution, such as Stark's solution, in about 50% formamide at about 
42°C), and washing conditions of about 60°C, about 0.5X SSC, and about 0.1% SDS. 
Conditions of high stringency can also be readily determined by the skilled artisan 
based on, for example, the length of the DNA. Generally, such conditions are defined 
as hybridization conditions as above, and with washing at approximately 68°C, about 
0.2X SSC, and about 0.1% SDS. The skilled artisan will recognize that the 
temperature and wash solution salt concentration can be adjusted as necessary 
according to factors such as the length of the probe. 
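
The way salt concentration, formamide, temperature, and probe length trade off against one another can be illustrated with a rough melting-temperature estimate. The sketch below is not part of the disclosed methods. It assumes one commonly quoted empirical formula for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L, and assumes that 1X SSC contributes roughly 0.195 M sodium. The function name, the 500-nucleotide probe, and the 50% GC content are hypothetical values chosen only for illustration.

```python
import math

# Assumption: 1X SSC is 0.15 M NaCl plus 0.015 M trisodium citrate,
# i.e. roughly 0.195 M sodium ions in total.
SSC_1X_SODIUM_MOLAR = 0.195


def estimate_tm_celsius(gc_percent, length_nt, ssc_fold, formamide_percent=0.0):
    """Rough Tm of a long DNA:DNA hybrid (empirical formula; an assumption here)."""
    sodium = ssc_fold * SSC_1X_SODIUM_MOLAR
    return (81.5
            + 16.6 * math.log10(sodium)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


# Hypothetical probe: 500 nucleotides, 50% GC.
hybridization_tm = estimate_tm_celsius(50, 500, ssc_fold=6.0, formamide_percent=50)
moderate_wash_tm = estimate_tm_celsius(50, 500, ssc_fold=0.5)
high_wash_tm = estimate_tm_celsius(50, 500, ssc_fold=0.2)

# Distance between the estimated Tm and the wash temperature; a smaller
# margin means fewer mismatched duplexes survive the wash.
moderate_margin = moderate_wash_tm - 60.0
high_margin = high_wash_tm - 68.0
```

Under these assumptions the 68°C, 0.2X SSC wash sits closer to the estimated Tm than the 60°C, 0.5X SSC wash, which is why it is the more stringent condition, and a shorter probe lowers the estimate further.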

Additional Sequences 

Due to the known degeneracy of the genetic code, wherein more than one 
codon can encode the same amino acid, a DNA sequence can vary from that shown in 
SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, and still encode PCGEM1. Such 
variant DNA sequences can result from silent mutations (e.g., occurring during PCR 
amplification), or can be the product of deliberate mutagenesis of a native sequence. 

The invention thus provides isolated DNA sequences of the invention selected 
from: (a) DNA comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, 
or SEQ ID NO:8; (b) DNA capable of hybridization to a DNA of (a) under conditions 
of moderate stringency; (c) DNA capable of hybridization to a DNA of (a) under 
conditions of high stringency; and (d) DNA which is degenerate as a result of the 
genetic code to a DNA defined in (a), (b), or (c). Such sequences are preferably 
provided and/or constructed in the form of an open reading frame uninterrupted by 
internal non-translated sequences, or introns, that are typically present in eukaryotic 
genes. Sequences of non-translated DNA can be present 5' or 3' from an open 
reading frame, where the same do not interfere with manipulation or expression of the 
coding region. Of course, should PCGEM1 encode a polypeptide, polypeptides 
encoded by such DNA sequences are encompassed by the invention. Conditions of 
moderate and high stringency are described above. 

In another embodiment, the nucleic acid molecules of the invention comprise 
nucleotide sequences that are at least 80% identical to a nucleotide sequence set forth 
herein. Also contemplated are embodiments in which a nucleic acid molecule 
comprises a sequence that is at least 90% identical, at least 95% identical, at least 98% 
identical, at least 99% identical, or at least 99.9% identical to a nucleotide sequence 
set forth herein. 

Percent identity may be determined by visual inspection and mathematical 
calculation. Alternatively, percent identity of two nucleic acid sequences may be 
determined by comparing sequence information using the GAP computer program, 
version 6.0 described by Devereux et al. (Nucl. Acids Res. 12:387, 1984) and available 
from the University of Wisconsin Genetics Computer Group (UWGCG). The 
preferred default parameters for the GAP program include: (1) a unary comparison 
matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, 
and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 
14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence 
and Structure, National Biomedical Research Foundation, pp. 353-358, 1979; (2) a 
penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each 
gap; and (3) no penalty for end gaps. Other programs used by one skilled in the art of 
sequence comparison may also be used. 
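
The calculation described above can be sketched for a pair of sequences that have already been aligned. The following is a minimal illustration, not the GAP program itself: it does not search for the best alignment, it only scores a given one with the unary matrix and gap penalties listed above and reports percent identity. Counting identity only over columns where neither sequence has a gap is an assumption, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of matching symbols over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: symbols across from gaps are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def gap_style_score(aligned_a, aligned_b, gap="-", gap_open=3.0, gap_extend=0.10):
    """Score an alignment: 1 per identity, 0 per mismatch, gap penalties, free end gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    # End gaps carry no penalty, so trim them before scoring.
    while columns and gap in columns[0]:
        columns.pop(0)
    while columns and gap in columns[-1]:
        columns.pop()
    score = 0.0
    previous_gap_row = None  # which sequence held the gap in the previous column
    for x, y in columns:
        if x == gap or y == gap:
            row = 0 if x == gap else 1
            if row != previous_gap_row:
                score -= gap_open  # 3.0 once per gap
            score -= gap_extend  # 0.10 for each symbol in the gap
            previous_gap_row = row
        else:
            previous_gap_row = None
            if x == y:
                score += 1.0
    return score


# Hypothetical aligned pair with one internal two-symbol gap.
identity = percent_identity("ACGTACGT", "ACG--CGT")
score = gap_style_score("ACGTACGT", "ACG--CGT")
```

A thresholding step (for example, at least 80% identity) would then be applied to the value returned by `percent_identity`.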

The invention also provides isolated nucleic acids useful in the production of 
polypeptides. Such polypeptides may be prepared by any of a number of conventional 
techniques. A DNA sequence of this invention or desired fragment thereof may be 
subcloned into an expression vector for production of the polypeptide or fragment. 
The DNA sequence advantageously is fused to a sequence encoding a suitable leader 
or signal peptide. Alternatively, the desired fragment may be chemically synthesized 
using known techniques. DNA fragments also may be produced by restriction 
endonuclease digestion of a full length cloned DNA sequence, and isolated by 
electrophoresis on agarose gels. If necessary, oligonucleotides that reconstruct the 5' 
or 3' terminus to a desired point may be ligated to a DNA fragment generated by 
restriction enzyme digestion. Such oligonucleotides may additionally contain a 
restriction endonuclease cleavage site upstream of the desired coding sequence, and 
position an initiation codon (ATG) at the N-terminus of the coding sequence. 

The well-known polymerase chain reaction (PCR) procedure also may be 
employed to isolate and amplify a DNA sequence encoding a desired protein 
fragment. Oligonucleotides that define the desired termini of the DNA fragment are 
employed as 5' and 3' primers. The oligonucleotides may additionally contain 
recognition sites for restriction endonucleases, to facilitate insertion of the amplified 
DNA fragment into an expression vector. PCR techniques are described in Saiki et 
al., Science 239:487 (1988); Recombinant DNA Methodology, Wu et al., eds., 
Academic Press, Inc., San Diego (1989), pp. 189-196; and PCR Protocols: A Guide 
to Methods and Applications, Innis et al., eds., Academic Press, Inc. (1990). 
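
The primer design described above, in which oligonucleotides define the termini of the fragment and carry restriction sites for cloning, can be sketched as follows. This is an illustrative sketch only. The template is an arbitrary placeholder and is not a PCGEM1 sequence; the choice of EcoRI (GAATTC) and BamHI (GGATCC) sites, the 18-nucleotide annealing length, the two-base 5' clamp, and the function names are all assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primers_with_sites(template, start, end, primer_len=18,
                       site_5p=ECORI_SITE, site_3p=BAMHI_SITE, clamp="GC"):
    """Build 5' and 3' primers for template[start:end] (0-based, end-exclusive).

    Each primer is: clamp + restriction site + annealing region. The 3'
    primer anneals to the sense strand, so its annealing region is the
    reverse complement of the last primer_len bases of the target.
    """
    target = template.upper()[start:end]
    if len(target) < primer_len:
        raise ValueError("target region is shorter than the primer length")
    forward = clamp + site_5p + target[:primer_len]
    reverse = clamp + site_3p + reverse_complement(target[-primer_len:])
    return forward, reverse


# Placeholder template (hypothetical; not derived from SEQ ID NO:1).
TEMPLATE = "ATGGCTAGCAAGTTCGATCCGTACGGATTACAGCTTGACCATGCAAGTCTGA"
forward_primer, reverse_primer = primers_with_sites(TEMPLATE, 0, len(TEMPLATE))
```

In practice one would also confirm that the chosen restriction sites do not occur inside the fragment to be cloned.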

USE OF PCGEM1 NUCLEIC ACID OR OLIGONUCLEOTIDES 

In a particular embodiment, the invention relates to PCGEM1 nucleotide 
sequences isolated from human prostate cells, including the complete genomic DNA 
(Figure 14, SEQ ID NO:8), and two full length cDNAs: SEQ ID NO:1 (Figure 8) and 
SEQ ID NO:2 (Figure 9), and fragments thereof. The nucleic acids of the invention, 
including DNA, RNA, mRNA and oligonucleotides thereof, are useful in a variety of 
applications in the detection, diagnosis, prognosis, and treatment of prostate cancer. 
Examples of applications within the scope of the present invention include, but are not 
limited to: 

amplifying PCGEM1 sequences; 

detecting a PCGEM1-derived marker of prostate cancer by hybridization with an 
oligonucleotide probe; 

identifying chromosome 2; 

mapping genes to chromosome 2; 

identifying genes associated with certain diseases, syndromes, or other 
conditions associated with human chromosome 2; 

constructing vectors having PCGEM1 sequences; 

expressing vector-associated PCGEM1 sequences as RNA and protein; 

detecting defective genes in an individual; 

developing gene therapy; 

developing immunologic reagents corresponding to PCGEM1-encoded 
products; and 

treating prostate cancer using antibodies, antisense nucleic acids, or 
other inhibitors specific for PCGEM1 sequences. 

Detecting, Diagnosing, and Treating Prostate Cancer 

The present invention provides a method of detecting prostate cancer in a 
patient, which comprises (a) detecting PCGEM1 mRNA in a biological sample from 
the patient; and (b) correlating the amount of PCGEM1 mRNA in the sample with the 
presence of prostate cancer in the patient. Detecting PCGEM1 mRNA in a biological 
sample may include: (a) isolating RNA from said biological sample; (b) amplifying a 
PCGEM1 cDNA molecule; (c) incubating the PCGEM1 cDNA with the isolated 
nucleic acid of the invention; and (d) detecting hybridization between the PCGEM1 
cDNA and the isolated nucleic acid. The biological sample can be selected from the 
group consisting of blood, urine, and tissue, for example, from a biopsy. In a 
preferred embodiment, the biological sample is blood. This method is useful in both 
the initial diagnosis of prostate cancer, and the later prognosis of disease. This 
method allows for testing prostate tissue in a biopsy, and after removal of a cancerous 
prostate, continued monitoring of the blood for micrometastases. 

According to this method of diagnosing and prognosticating prostate cancer in 
a patient, the amount of PCGEM1 mRNA in a biological sample from a patient is 
correlated with the presence of prostate cancer in the patient. Those of ordinary skill 
in the art can readily assess the level of over-expression that is correlated with the 
presence of prostate cancer. 
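
As a purely illustrative sketch of the correlation step, the amount of PCGEM1 mRNA in a tumor sample can be expressed relative to matched normal tissue after normalizing each measurement to a reference transcript. Nothing below is prescribed by this disclosure: the use of a reference transcript, the two-fold cut-off, the signal values, and the function names are all hypothetical assumptions.

```python
def normalized_expression(target_signal, reference_signal):
    """Target signal divided by a reference (housekeeping) signal from the same sample."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return target_signal / reference_signal


def fold_change(tumor_target, tumor_reference, normal_target, normal_reference):
    """Normalized tumor expression relative to normalized matched-normal expression."""
    normal = normalized_expression(normal_target, normal_reference)
    if normal == 0:
        raise ValueError("normal expression is zero; fold change is undefined")
    return normalized_expression(tumor_target, tumor_reference) / normal


def is_overexpressed(fold, threshold=2.0):
    """Flag over-expression; the 2.0 threshold is a hypothetical cut-off."""
    return fold >= threshold


# Hypothetical band intensities (arbitrary units) from an RT-PCR gel.
example_fold = fold_change(tumor_target=900, tumor_reference=300,
                           normal_target=200, normal_reference=200)
example_call = is_overexpressed(example_fold)
```

The cut-off actually used would have to be established empirically from patient specimens rather than taken from this example.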

In another embodiment, this invention provides a vector, comprising a 
PCGEM1 promoter sequence operatively linked to a nucleotide sequence encoding a 
cytotoxic protein. The invention further provides a method of selectively killing a 
prostate cancer cell, which comprises introducing the vector to prostate cancer cells 
under conditions sufficient to permit selective killing of the prostate cells. As used 
herein, the phrase "selective killing" is meant to include the killing of at least a cell 
which is specifically targeted by a nucleotide sequence. The putative PCGEM1 
promoter, contained in the 5' flanking region of the PCGEM1 genomic sequence, 
SEQ ID NO:3, is set forth in Figure 11. Applicants envision that a nucleotide 
sequence encoding any cytotoxic protein can be incorporated into this vector for 
delivery to prostate tissue. For example, the cytotoxic protein can be ricin, abrin, 
diphtheria toxin, p53, thymidine kinase, tumor necrosis factor, cholera toxin, 
Pseudomonas aeruginosa exotoxin A, ribosomal inactivating proteins, or mycotoxins 
such as trichothecenes, and derivatives and fragments (e.g., single chains) thereof. 

This invention also provides a method of identifying an androgen-responsive 
cell line, which comprises (a) obtaining a cell line suspected of being androgen- 
responsive, (b) incubating the cell line with an androgen; and (c) detecting PCGEM1 
mRNA in the cell line, wherein an increase in PCGEM1 mRNA, as compared to an 
untreated cell line, correlates with the cell line being androgen-responsive. 

The invention further provides a method of measuring the responsiveness of a 
prostatic tissue to hormone-ablation therapy, which comprises (a) treating the 
prostatic tissue with hormone-ablation therapy; and (b) measuring PCGEM1 mRNA 
in the prostatic tissue following hormone-ablation therapy, wherein a decrease in 
PCGEM1 mRNA, as compared to an untreated cell line, correlates with the cell line 
responding to hormone-ablation therapy. 

In another aspect of the invention, these nucleic acid molecules may be 
introduced into a recombinant vector, such as a plasmid, cosmid, or virus, which can 
be used to transfect or transduce a host cell. The nucleic acids of the present invention 
may be combined with other DNA sequences, such as promoters, polyadenylation 
signals, restriction enzyme sites, multiple cloning sites, and other coding sequences. 

Probes 

Among the uses of nucleic acids of the invention is the use of fragments as 
probes or primers. Such fragments generally comprise at least about 17 contiguous 
nucleotides of a DNA sequence. The fragment may have fewer than 17 nucleotides, 
such as, for example, 10 or 15 nucleotides. In other embodiments, a DNA fragment 
comprises at least 20, at least 30, or at least 60 contiguous nucleotides of a DNA 
sequence. Examples of probes or primers of the invention include those of SEQ ID 
NO: 5, SEQ ID NO: 6, and SEQ ID NO: 7, as well as those disclosed in Table I. 

Table I 

Primer        Sequence                          S/AS   Starting Base #   SEQ ID NO 
p413          TGGCAACAGGCAAGCAGAG               S      510               SEQ ID NO: 9 
p414          GGCCAAAATAAAACCAAACAT             AS     610               SEQ ID NO: 10 
[illegible]   GCAAATATGATTTAAAGATACAAC          S      [illegible]       SEQ ID NO: 11 
p490          GGTTGTATCTTTAAATCATATTTGC         AS     776               SEQ ID NO: 12 
p491          ACTGTCTTTTCATATATTTCTCAATGC       S      559               SEQ ID NO: 13 
p517          AAGTAGTAATTTTAAACATGGGAC          AS     1516              SEQ ID NO: 14 
p518          [illegible]CAATTAGGCAGCAACC       S      131               SEQ ID NO: 15 
p519          GAATTGTCTTTGTGATTG[illegible]AG   S      1338              SEQ ID NO: 16 
p560          CAATTCACAAAGACAATTCAGTTAAG        AS     1355              SEQ ID NO: 17 
p561          ACAATTAGACAATGTCCAGCTGA           AS     1154              SEQ ID NO: 18 
p562          CTTTGGCTGATATCATGAAGTGTC          AS     322               SEQ ID NO: 19 
p623          AACCTTTTGCCCTATGCCGTAAC           S      148               SEQ ID NO: 20 
p624          GAGACTCCCAACCTGATGATGT            AS     376               SEQ ID NO: 21 
p839          GGTCACGTTGAGTCCCAGTG              AS     270               SEQ ID NO: 22 

S/AS indicates whether the primer is Sense or AntiSense. 
Starting Base # indicates the starting base number with respect to the sequence of 
SEQ ID NO:1. 

However, even larger probes may be used. For example, a particularly preferred 
probe is derived from PCGEM1 (SEQ ID NO: 1) and comprises nucleotides 116 to 
1140 of that sequence. It has been designated SEQ ID NO: 4 and is set forth in Figure 
12. 

When a hybridization probe binds to a target sequence, it forms a duplex 
molecule that is both stable and selective. These nucleic acid molecules may be 
readily prepared, for example, by chemical synthesis or by recombinant techniques. A 
wide variety of methods are known in the art for detecting hybridization, including 
fluorescent, radioactive, or enzymatic means, or other ligands such as avidin/biotin. 

In another aspect of the invention, these nucleic acid molecules may be 
introduced into a recombinant vector, such as a plasmid, cosmid, or virus, which can 
be used to transfect or transduce a host cell. The nucleic acids of the present invention 
may be combined with other DNA sequences, such as promoters, polyadenylation 
signals, restriction enzyme sites, multiple cloning sites, and other coding sequences. 

Because homologs of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 from 
other mammalian species are contemplated herein, probes based on the human DNA 
sequence of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 may be used to screen 
cDNA libraries derived from other mammalian species, using conventional 
cross-species hybridization techniques. 

In another aspect of the invention, one can use the knowledge of the genetic 
code in combination with the sequences set forth herein to prepare sets of degenerate 
oligonucleotides. Such oligonucleotides are useful as primers, e.g., in polymerase 
chain reactions (PCR), whereby DNA fragments are isolated and amplified. 
Particularly preferred primers are set forth in Figure 13 and Table I and are 
designated SEQ ID NOS: 5-7 and 9-22, respectively. A particularly preferred primer 
pair is p518 (SEQ ID NO: 15) and p839 (SEQ ID NO: 22), which, when used in PCR, 
preferentially amplifies mRNA, thereby avoiding less desirable cross-reactivity with 
genomic DNA. 
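
The starting bases listed in Table I can be used to estimate the size of the cDNA product expected from a given primer pair. The following is a minimal sketch, not part of the disclosed method. It assumes that the starting base of an antisense primer is the position on SEQ ID NO: 1 that pairs with the 5' end of that primer, so that the product spans both starting bases inclusive. The function name and the dictionary layout are illustrative only.

```python
# Strand and starting base for selected primers, copied from Table I
# (positions are relative to SEQ ID NO: 1).
PRIMERS = {
    "p413": ("S", 510),
    "p414": ("AS", 610),
    "p518": ("S", 131),
    "p839": ("AS", 270),
}


def amplicon_length(sense_name, antisense_name, primers=PRIMERS):
    """Predicted cDNA amplicon length (bp) for a sense/antisense primer pair.

    Assumption (not stated in the text): the antisense starting base is the
    position on SEQ ID NO: 1 paired with the primer's 5' end, so the product
    covers sense_start..antisense_start inclusive.
    """
    s_strand, s_start = primers[sense_name]
    a_strand, a_start = primers[antisense_name]
    if s_strand != "S" or a_strand != "AS":
        raise ValueError("expected a sense primer and an antisense primer")
    if a_start <= s_start:
        raise ValueError("antisense primer must lie downstream of the sense primer")
    return a_start - s_start + 1


if __name__ == "__main__":
    print(amplicon_length("p518", "p839"))
```

Under that assumption the preferred pair p518/p839 would give a product of 140 bp from cDNA. The size of any product from genomic DNA cannot be derived from Table I alone.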



Chromosome Mapping 

As set forth in Example 3, the PCGEM1 gene has been mapped by fluorescent 
in situ hybridization to the 2q32 region of chromosome 2 using a bacterial artificial 
chromosome (BAC) clone containing PCGEM1 genomic sequence. Thus, all or a 
portion of the nucleic acid molecule of SEQ ID NO:1, SEQ ID NO:2, and SEQ ID 
NO:8, including oligonucleotides, can be used by those skilled in the art using 
well-known techniques to identify human chromosome 2, and the specific locus thereof, 
that contains the PCGEM1 DNA. Useful techniques include, but are not limited to, 
using the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, or 
fragments thereof, including oligonucleotides, as a probe in various well-known 
techniques such as radiation hybrid mapping (high resolution), in situ hybridization to 
chromosome spreads (moderate resolution), and Southern blot hybridization to hybrid 
cell lines containing individual human chromosomes (low resolution). 

For example, chromosomes can be mapped by radiation hybridization. First, 
PCR is performed using the Whitehead Institute/MIT Center for Genome Research 
Genebridge4 panel of 93 radiation hybrids 
(http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/rhmap/genebridge4.html). 
Primers are used which lie 
within a putative exon of the gene of interest and which amplify a product from 
human genomic DNA, but do not amplify hamster genomic DNA. The results of the 
PCRs are converted into a data vector that is submitted to the Whitehead/MIT 
Radiation Mapping site on the internet (http://www-seq.wi.mit.edu). The data is 
scored and the chromosomal assignment and placement relative to known Sequence 
Tag Site (STS) markers on the radiation hybrid map is provided. (The following web 
site provides additional information about radiation hybrid mapping: 
http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/ 
07-97.INTRO.html). 
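
The conversion of PCR results into a data vector can be sketched as follows. This is an illustration only. The encoding shown (1 for a hybrid that yields a product, 0 for no product, 2 for an ambiguous or untested hybrid) is an assumption about the submission format and is not taken from the text, and the function name is hypothetical.

```python
def rh_data_vector(calls, panel_size=93):
    """Encode per-hybrid PCR calls as a string with one character per hybrid.

    calls maps a 1-based hybrid number to True (product), False (no product)
    or None (ambiguous). Hybrids missing from the mapping are treated as
    ambiguous. The 1/0/2 symbols are an assumed encoding.
    """
    symbols = {True: "1", False: "0", None: "2"}
    return "".join(symbols[calls.get(i)] for i in range(1, panel_size + 1))


if __name__ == "__main__":
    # Hypothetical calls for the first three hybrids of the panel.
    print(rh_data_vector({1: True, 2: False, 3: None}))
```

The panel size defaults to the 93 hybrids of the Genebridge4 panel mentioned above.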

Identifying Associated Diseases 

As noted above, PCGEM1 has been mapped to the 2q32 region of 
chromosome 2. This region is associated with specific diseases, which include but are 
not limited to diabetes mellitus (insulin dependent), and T cell leukemia/lymphoma. 
Thus, the nucleic acids of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8, or 
fragments thereof, can be used by one skilled in the art using well-known techniques 
to analyze abnormalities associated with gene mapping to chromosome 2. This 
enables one to distinguish conditions in which this marker is rearranged or deleted. In 
addition, nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8, or fragments 
thereof, can be used as a positional marker to map other genes of unknown location. 

The DNA may be used in developing treatments for any disorder mediated 
(directly or indirectly) by defective, or insufficient amounts of PCGEM1, including 
prostate cancer. Disclosure herein of native nucleotide sequences permits the 
detection of defective genes, and the replacement thereof with normal genes. 
Defective genes may be detected in in vitro diagnostic assays, and by comparison of a 
native nucleotide sequence disclosed herein with that of a gene derived from a person 
suspected of harboring a defect in this gene. 

Sense-Antisense 

Other useful fragments of the nucleic acids include antisense or sense 
oligonucleotides comprising a single-stranded nucleic acid sequence (either RNA or 
DNA) capable of binding to target mRNA (sense) or DNA (antisense) sequences. 
Antisense or sense oligonucleotides, according to the present invention, comprise a 
fragment of DNA (SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8). Such a fragment 
generally comprises at least about 14 nucleotides, preferably from about 14 to about 
30 nucleotides. The ability to derive an antisense or a sense oligonucleotide, based 
upon a cDNA sequence encoding a given protein is described in, for example, Stein 
and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. (BioTechniques 6:958, 
1988). 
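
Deriving candidate antisense oligonucleotides from a target sequence amounts to taking the reverse complement of windows of the chosen length. The following minimal sketch enumerates such windows within the 14 to 30 nucleotide range given above. The function names are illustrative, and no selection criteria (secondary structure, specificity, melting temperature) are applied.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligos(target, length=20):
    """Yield (start, oligo) for every window of the target sense sequence.

    start is the 1-based position of the window on the target, and oligo is
    the antisense oligonucleotide (5' to 3') complementary to that window.
    """
    if not 14 <= length <= 30:
        raise ValueError("length should be about 14 to about 30 nucleotides")
    for i in range(len(target) - length + 1):
        yield i + 1, reverse_complement(target[i:i + length])


if __name__ == "__main__":
    # p490 (antisense, SEQ ID NO: 12) from Table I, converted to its sense strand.
    print(reverse_complement("GGTTGTATCTTTAAATCATATTTGC"))
```

As a check, the reverse complement of antisense primer p490 begins with the 24 nucleotides listed for the sense primer of SEQ ID NO: 11 in Table I.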

The biologic activity of PCGEM1 in assay cells and the over expression of 
PCGEM1 in prostate cancer tissues suggest that elevated levels of PCGEM1 promote 
prostate cancer cell growth. Thus, the antisense oligonucleotides to PCGEM1 may be 
used to reduce the expression of PCGEM1 and, consequently, inhibit the growth of 
the cancer cells. 

Binding of antisense or sense oligonucleotides to target nucleic acid sequences 
results in the formation of duplexes. The antisense oligonucleotides thus may be used 
to block expression of proteins or to inhibit the function of RNA. Antisense or sense 
oligonucleotides further comprise oligonucleotides having modified sugar- 
phosphodiester backbones (or other sugar linkages, such as those described in 
WO91/06629) and wherein such sugar linkages are resistant to endogenous nucleases. 
Such oligonucleotides with resistant sugar linkages are stable in vivo (i.e., capable of 
resisting enzymatic degradation) but retain sequence specificity to be able to bind to 
target nucleotide sequences. 

Other examples of sense or antisense oligonucleotides include those 
oligonucleotides which are covalently linked to organic moieties, such as those 
described in WO 90/10448, and other moieties that increase the affinity of the 
oligonucleotide for a target nucleic acid sequence, such as poly-(L-lysine). Further 
still, intercalating agents, such as ellipticine, and alkylating agents or metal complexes 
may be attached to sense or antisense oligonucleotides. Such modifications may 
modify binding specificities of the antisense or sense oligonucleotide for the target 
nucleotide sequence. 

Antisense or sense oligonucleotides may be introduced into a cell containing 
the target nucleic acid sequence by any gene transfer method, including, for example, 
lipofection, CaPO4-mediated DNA transfection, electroporation, or by using gene 
transfer vectors such as Epstein-Barr virus or adenovirus. 

Sense or antisense oligonucleotides also may be introduced into a cell 
containing the target nucleotide sequence by formation of a conjugate with a ligand 
binding molecule, as described in WO 91/04753. Suitable ligand binding molecules 
include, but are not limited to, cell surface receptors, growth factors, other cytokines, 
or other ligands that bind to cell surface receptors. Preferably, conjugation of the 
ligand binding molecule does not substantially interfere with the ability of the ligand 
binding molecule to bind to its corresponding molecule or receptor, or block entry of 
the sense or antisense oligonucleotide or its conjugated version into the cell. 

Alternatively, a sense or an antisense oligonucleotide may be introduced into a 
cell containing the target nucleic acid sequence by formation of an oligonucleotide- 
lipid complex, as described in WO 90/10448. The sense or antisense oligonucleotide- 
lipid complex is preferably dissociated within the cell by an endogenous lipase. 

POLYPEPTIDES AND FRAGMENTS THEREOF 

The invention also encompasses polypeptides and fragments thereof in various 
forms, including those that are naturally occurring or produced through various 
techniques such as procedures involving recombinant DNA technology. Such forms 
include, but are not limited to, derivatives, variants, and oligomers, as well as fusion 
proteins or fragments thereof. 

The polypeptides of the invention include full length proteins encoded by the 
nucleic acid sequences set forth above. The polypeptides of the invention may be 
membrane bound or they may be secreted and thus soluble. The invention also 
includes the expression, isolation and purification of the polypeptides and fragments 
of the invention, accomplished by any suitable technique. 

The following examples further illustrate preferred aspects of the invention. 

EXAMPLE 1: Differential Gene Expression Analysis in Prostate Cancer 

Using the differential display technique, we identified a novel gene that is 
over-expressed in prostate cancer cells. Differential display provides a method to 
separate and clone individual messenger RNAs by means of the polymerase chain 
reaction, as described in Liang et al., Science, 257:967-71 (1992), which is hereby 
incorporated by reference. Briefly, the method entails using two groups of 
oligonucleotide primers. One group is designed to recognize the polyadenylate tail of 
messenger RNAs. The other group contains primers that are short and arbitrary in 
sequence and anneal to positions in the messenger RNA randomly distributed from 
the polyadenylate tail. Products amplified with these primers can be differentiated on 
a sequencing gel based on their size. If different cell populations are amplified with 
the same groups of primers, one can compare the amplification products to identify 
differentially expressed RNA sequences. 

Differential display ("DD") kits from Genomyx (Foster City, California) were 
used to analyze differential gene expression. The steps of the differential display 
technique are summarized in Figure 1. Histologically well defined matched tumor 
and normal prostate tissue sections containing approximately similar proportions of 
epithelial cells were chosen from individual prostate cancer patients. 

Genomic DNA-free total RNA was extracted from this enriched pool of cells 
using RNAzol B (Tel-Test, Inc., Friendswood, TX) according to the manufacturer's 
protocol. The epithelial nature of the RNA source was further confirmed using 
cytokeratin 18 expression (45) in reverse transcriptase-polymerase chain reaction 
(RT-PCR) assays. Using arbitrary and anchored primers containing 5' M13 or T7 
sequences (obtained from Biomedical Instrumentation Center, Uniformed Services 
University of the Health Sciences, Bethesda), the isolated DNA-free total RNA was 
amplified by RT-PCR, which was performed using ten anchored antisense primers and 
four arbitrary sense primers according to the protocol provided by Hieroglyph™ RNA 
Profile Kit 1 (Genomyx Corporation, CA). The cDNA fragments produced by the 
RT-PCR assay were analyzed by high resolution gel electrophoresis, carried out 
using a Genomyx™ LR DNA sequencer and LR-Optimized™ HR-1000™ gel 
formulations (Genomyx Corporation, CA). 

A partial DD screening of normal/tumor tissues revealed 30 differentially 
expressed cDNA fragments, with 53% showing reduced or no expression in tumor 
RNA specimens and 47% showing overexpression in tumor RNA specimens (Figure 
2). These cDNAs were excised from the DD gels, reamplified using T7 and M13 
primers and the RT-PCR conditions recommended in Hieroglyph™ RNA Profile Kit-1 
(Genomyx Corp., CA), and sequenced. The inclusion of T7 and M13 sequencing 
primers in the DD primers allowed rapid sequencing and orientation of cDNAs 
(Figure 1). 

All the reamplified cDNA fragments were purified by Centricon-c-100 system 
(Amicon, USA). The purified fragments were sequenced by cycle sequencing and 
DNA sequence determination using an ABI 377 DNA sequencer. Isolated sequences 
were analyzed for sequence homology with known sequences by running searches 
through publicly available DNA sequence databases, including the National Center for 
Biotechnology Information and the Cancer Genome Anatomy Project. Approximately 
two-thirds of these cDNA sequences exhibited homology to previously described 
DNA sequences/genes e.g., ribosomal proteins, mitochondrial DNA sequences, 
growth factor receptors, and genes involved in maintaining the redox state in cells. 
About one-third of the cDNAs represented novel sequences, which did not exhibit 
similarity to the sequences available in publicly available databases. The PCGEM1 
fragment, obtained from the initial differential display screening represents a 530 base 
pair (nucleotides 410 to 940 of SEQ ID NO: 1) cDNA sequence which, in initial 
searches, did not exhibit any significant homology with sequences in the publicly 
available databases. Later searching of the high throughput genome sequence 
(HTGS) database revealed perfect homology to a chromosome 2 derived 
uncharacterized, unfinished genomic sequence (accession # AC 013401). 

EXAMPLE 2: Characterization of Full Length PCGEM1 cDNA Sequence 

The full length of PCGEM1 was obtained by 5' and 3' RACE/PCR from the 
original 530 bp DD product (nucleotides 410 to 940 of PCGEM1 cDNA SEQ ID 
NO:1) using a normal prostate cDNA library in lambda phage (Clontech, CA). The 
RACE/PCR products were directly sequenced. Lasergene and MacVector DNA 
analysis software were used to analyze DNA sequences and to define open reading 
frame regions. We also used the original DD product to screen a normal prostate 
cDNA library. Three overlapping cDNA clones were identified. 

Sequencing of the cDNA clones was performed on an ABI-310 sequence 
analyzer and a new dRhodamine cycle sequencing kit (PE-Applied Biosystem, CA). 
The longest PCGEM1 cDNA clone, SEQ ID NO:1 (Figure 8), revealed 1643 
nucleotides with a potential polyadenylation site, ATTAAA, close to the 3' end 
followed by a poly (A) tail. As noted above, although initial searching of the PCGEM1 
gene in publicly available DNA databases (e.g., National Center for Biotechnology 
Information) using the BLAST program did not reveal any homology, a recent search 
of the HTGS database revealed perfect homology of PCGEM1 (using cDNA of SEQ 
ID NO: 1) to a chromosome 2 derived uncharacterized, unfinished genomic sequence 
(accession # AC 013401). One of the cDNA clones, SEQ ID NO:2 (Figure 9), 
contained a 123 bp insertion at position 278, and this inserted sequence showed strong 
homology (87%) to Alu sequence. It is likely that this clone represented a 
premature transcript. Sequencing of several clones from RT-PCR further confirmed 
the presence of the two forms of transcripts. 

Sequence analysis did not reveal any significant long open reading frame in 
either strand. The longest ORF in the sense strand was 105 nucleotides (572-679) 
encoding a 35 amino acid peptide. However, the ATG was not in a strong context for 
initiation. Although we could not rule out the coding capacity for a very small 
peptide, it is possible that PCGEM1 may function as a non-coding RNA. 
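
An open reading frame search of this kind can be sketched as a scan of all three frames of both strands for the longest ATG-to-stop interval. The code below is an illustration under simple assumptions (standard stop codons, ATG as the only start codon, no scoring of the initiation context). It is not the Lasergene or MacVector analysis used here, and the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (strand, start, end, n_codons) for the longest ATG..stop ORF.

    start and end are 1-based on the scanned strand and end includes the stop
    codon; n_codons counts coding codons only. Returns None if no ORF exists.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", seq.translate(COMPLEMENT)[::-1])):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if best is None or n_codons > best[3]:
                        best = (strand, start + 1, i + 3, n_codons)
                    start = None  # look for the next ORF in this frame
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGGCTGCTGCTTAAGG"))
```

With this convention, an ORF of 35 codons plus its stop codon spans 108 bases, which matches the length of the interval 572-679 reported above.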

The sequence of PCGEM1 cDNA has been verified by several approaches 
including characterization of several clones of PCGEM1 and analysis of PCGEM1 
cDNAs amplified from normal prostate tissue and prostate cancer cell lines. We have 
also obtained the genomic clones of PCGEM1, which have helped to confirm the 
PCGEM1 cDNA sequence. The complete genomic DNA sequence of PCGEM1 
(SEQ ID NO:8) is shown in Figure 14. In Figure 14 (and in the accompanying 
Sequence Listing), "Y" represents any one of the four nucleotide bases, cytosine, 
thymine, adenine, or guanine. Comparison of the cDNA and genomic sequences 
revealed the organization of the PCGEM1 transcription unit from three exons (Figure 
15: E, Exon; B: BamHI; H: HindIII; X: XbaI; R: EcoRI). 

EXAMPLE 3: Mapping the Location of PCGEM1 

Using fluorescent in situ hybridization and the PCGEM1 genomic DNA as a 
probe, we mapped the location of PCGEM1 on chromosome 2q to specific region 
2q32 (Figure 7A). Specifically, a Bacterial Artificial Chromosome (BAC) clone 
containing the PCGEM1 genomic sequence was isolated by custom services of 
Genome Systems (St. Louis, MO). PCGEM1-BAC clone 1 DNA was nick translated 
using spectrum orange (Vysis) as a direct label and fluorescent in situ hybridization 
was done using this probe on normal human male metaphase chromosome spreads. 
Counterstaining was done and chromosomal localization was determined based on the 
G-band analysis of inverted 4',6-diamidino-2-phenylindole (DAPI) images. (Figure 
7B: a DAPI counter-stained chromosome 2 is shown on the left; an inverted DAPI 
stained chromosome 2 shown as G-bands is shown in the center; an ideogram of 
chromosome 2 showing the localization of the signal to band 2q32 (bar) is shown on 
the right.) NU200 image acquisition and registration software was used to create the 
digital images. More than 20 metaphases were analyzed. 

EXAMPLE 4: Analysis of PCGEM1 Gene Expression in Prostate Cancer 

To further characterize the tumor specific expression of the PCGEM1 
fragment, and also to rule out individual variations of gene expression alterations 
commonly observed in tumors, the expression of the PCGEM1 fragment was 
evaluated on a test panel of matched tumor and normal RNAs derived from the 
microdissected tissues of twenty prostate cancer patients. 

Using the PCGEM1 cDNA sequence (SEQ ID NO:1), specific PCR primers 
(Sense primer 1 (SEQ ID NO: 5): 5' TGCCTCAGCCTCCCAAGTAAC 3' and 
Antisense primer 2 (SEQ ID NO: 6): 5' GGCCAAAATAAAACCAAACAT 3') were 
designed for RT-PCR assays. Fresh frozen normal and tumor tissues derived from 
radical prostatectomy of prostate cancer patients and embedded in OCT compound 
(Miles Inc., Elkhart, IN) were characterized for histopathology by examining 
hematoxylin and eosin stained sections (46). Tumor and normal prostate tissue 
regions representing approximately equal numbers of epithelial cells were dissected 
out of frozen sections. DNA-free RNA was prepared from these tissues and used in 
RT-PCR analysis to detect PCGEM1 expression. One hundred nanograms of total 
RNA was reverse transcribed into cDNA using an RT-PCR kit (Perkin-Elmer, Foster 
City, CA). The PCR was performed using Amplitaq Gold from Perkin-Elmer (Foster 
City, CA). PCR cycles used were: 95°C for 10 minutes, 1 cycle; 95°C for 30 seconds, 
55°C for 30 seconds, 72°C for 30 seconds, 42 cycles; and 72°C for 5 minutes, 1 cycle, 
followed by 4°C storage. 
Epithelial cell-associated cytokeratin 18 was used as an internal control. 
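
The cycling conditions above can be written down as a small data structure, which also makes it easy to total the programmed hold times. This is a sketch for illustration only. The layout of PROGRAM and the function name are assumptions, and ramp times between temperatures and the final 4°C hold are not counted.

```python
# Each stage is (number of cycles, [(temperature in deg C, hold in seconds), ...]).
PROGRAM = [
    (1, [(95, 600)]),                       # initial 10 minute step
    (42, [(95, 30), (55, 30), (72, 30)]),   # denature, anneal, extend
    (1, [(72, 300)]),                       # final 5 minute extension
]


def total_hold_minutes(program):
    """Sum of all programmed hold times in minutes (ramping not included)."""
    seconds = sum(
        cycles * sum(hold for _temp, hold in steps) for cycles, steps in program
    )
    return seconds / 60


if __name__ == "__main__":
    print(total_hold_minutes(PROGRAM))
```

For the program described, the programmed holds add up to 78 minutes before the 4°C storage step.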

RT-PCR analysis of microdissected matched normal and tumor tissue derived 
RNAs from 23 CaP patients revealed tumor associated overexpression of PCGEM1 in 
13 (56%) of the patients (Figure 5). Six of twenty-three (26%) patients did not exhibit 
detectable PCGEM1 expression in either normal or tumor tissue derived RNAs. 
Three of twenty-three (13%) tumor specimens showed reduced expression in tumors. 
One of the patients did not exhibit any change. Expression of housekeeping genes, 
cytokeratin-18 (Figure 3) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 
(data not shown), remained constant in tumor and normal specimens of all the 
patients. These results were further confirmed by another set of PCGEM1 specific 
primers (Sense Primer 3 (SEQ ID NO: 7): 5' TGGCAACAGGCAAGCAGAG 3' and 
Antisense Primer 2 (SEQ ID NO: 6): 5' GGCCAAAATAAAACCAAACAT 3'). 
Four of 16 (25%) patients did not exhibit detectable PCGEM1 expression in either 
normal or tumor tissue derived RNAs. Two of 16 (12.5%) tumor specimens showed 
reduced expression in tumors. These results of PCGEM1 expression in tumor tissues 
could be explained by the expected individual variations between tumors of different 
patients. Most importantly, initial DD observations were confirmed by showing that 
45% of patients analyzed did exhibit overexpression of PCGEM1 in tumor prostate 
tissues when compared to corresponding normal prostate tissue of the same 
individual. 
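
The proportions reported for the first primer set can be checked by tallying the four outcome groups. The sketch below uses only the counts given above for the 23 patients. The category labels and the function name are illustrative.

```python
# Outcome counts for the first primer set (23 patients), taken from the text.
COUNTS = {
    "overexpressed in tumor": 13,
    "not detected": 6,
    "reduced in tumor": 3,
    "no change": 1,
}


def percentages(counts):
    """Return each category as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in percentages(COUNTS).items():
        print(f"{name}: {pct}%")
```

The four groups account for all 23 patients, and the computed values agree, to the nearest whole percent, with the 56%, 26%, and 13% figures stated in the text.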

EXAMPLE 5: In situ Hybridization 

In situ hybridization was performed essentially as described by Wilkinson and 
Green (48). Briefly, OCT embedded tissue slides stored at -80°C were fixed in 4% 
PFA (paraformaldehyde), digested with proteinase K and then again fixed in 4% PFA. 
After washing in PBS, sections were treated with 0.25% acetic anhydride in 0.1 M 
triethanolamine, washed again in PBS, and dehydrated in a graded ethanol series. 
Sections were hybridized with 35S-labeled riboprobes at 52°C overnight. After 
washing and RNase A treatment, sections were dehydrated, dipped into NTB-2 
emulsion and exposed for 11 days at 4°C. After development, slides were lightly 
stained with hematoxylin and mounted for microscopy. In each section, PCGEM1 
expression was scored as percentage of cells showing 35S signal: 1+, 1-25%; 2+, 
25-50%; 3+, 50-75%; 4+, 75-100%. 
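
The scoring scale can be expressed as a simple lookup. The sketch below is illustrative. Because the stated ranges share their end points (for example, 25% appears in both 1+ and 2+), it assumes that a boundary value receives the lower score, and it assumes a score of 0 when no cells show signal. Neither rule is specified in the text.

```python
def ish_score(percent_positive):
    """Map the percentage of cells showing signal to a score from 0 to 4.

    Assumptions: 0% scores 0, and a value on a shared boundary (25, 50, 75)
    is assigned the lower of the two possible scores.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    for score, upper in ((1, 25), (2, 50), (3, 75), (4, 100)):
        if percent_positive <= upper:
            return score


if __name__ == "__main__":
    for pct in (0, 10, 40, 60, 90):
        print(pct, f"{ish_score(pct)}+")
```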

Paired normal (benign) and tumor specimens from 13 patients were tested 
using in situ hybridization. A representative example is shown in Figure 17. In 11 
cases (84%) tumor associated elevation of PCGEM1 expression was detected. In 5 of 
these 11 patients the expression of PCGEM1 increased to 1+ in the tumor area from 
an essentially undetectable level in the normal area (on the 0 to 4+ scale). Tumor 
specimens from 4 of 11 patients scored between 2+ (example shown in Figure 17B) 
and 4+. Two of 11 patients showed focal signals with 3+ score in the tumor area, and 
one of these patients had similar focal signal (2+) in an area pathologically designated 
as benign. In the remaining 2 of the 13 cases there was no detectable signal in any of 
the tissue areas tested. The results indicate that PCGEM1 expression appears to be 
restricted to glandular epithelial cells. (Figure 17 shows an example of in situ 
hybridization of 35S-labeled PCGEM1 riboprobe to matched normal (A) versus tumor 
(B) sections of prostate cancer patients. The light gray areas are hematoxylin stained 
cell bodies, the black dots represent the PCGEM1 expression signal. The signal is 
background level in the normal (A), 2+ level in the tumor (B) section. The 
magnification is 40x.) 

EXAMPLE 6: PCGEM1 Gene Expression in Prostate Tumor Cell Lines 

PCGEM1 gene expression was also evaluated in established prostate cancer 
cell lines: LNCaP, DU145, PC3 (all from ATCC), DuPro (available from Dr. David 
Paulson, Duke University, Durham, NC), and an E6/E7-immortalized primary 
prostate cancer cell line, CPDR1 (47). CPDR1 is a primary CaP derived cell line 
immortalized by retroviral vector, LXSN 16 E6 E7, expressing the E6 and E7 genes of 
human papilloma virus 16. LNCaP is a well studied, androgen-responsive prostate 
cancer cell line, whereas DU145, PC3, DuPro and CPDR1 are androgen-independent 
and lack detectable expression of the androgen receptor. Utilizing the RT-PCR assay 
described above, PCGEM1 expression was easily detectable in LNCaP (Figure 4). 
However, PCGEM1 expression was not detected in prostate cancer cell lines DU145, 
PC3, DuPro and CPDR1. Thus, PCGEM1 was expressed in the androgen-responsive 
cell line but not in the androgen-independent cell lines. These results indicate that 
hormones, particularly androgen, may play a key role in regulating PCGEM1 
expression in prostate cancer cells. In addition, the results suggest that PCGEM1 
expression may be used to distinguish between hormone responsive tumor cells and 
more aggressive hormone refractory tumor cells. 

To test if PCGEM1 expression is regulated by androgens, we performed 
experiments evaluating PCGEM1 expression in LNCaP cells (ATCC) cultured with 
and without androgens. Total RNA from LNCaP cells, treated with synthetic 
androgen R1881 (obtained from DUPONT, Boston, MA), was analyzed for 
PCGEM1 expression. Both RT-PCR analysis (Figure 5a) and Northern blot analysis 
(Figure 5b) were conducted as follows. 

LNCaP cells were maintained in RPMI 1640 (Life Technologies, Inc., 
Gaithersburg, MD) supplemented with 10% fetal bovine serum (FBS, Life 
Technologies, Inc., Gaithersburg, MD) and experiments were performed on cells 
between passages 20 and 35. For the studies of NKX3.1 gene expression regulation, 
charcoal/dextran stripped androgen-free FBS (cFBS, Gemini Bio-Products, Inc., 
Calabasas, CA) was used. LNCaP cells were cultured first in RPMI 1640 with 10% 
cFBS for 4 days and then stimulated with a non-metabolizable androgen analog 
R1881 (DUPONT, Boston, MA) at different concentrations for different times as 



WO 00/58470 PCT/US00/07906 

shown in Figure 5A. LNCaP cells identically treated but without R1881 served as 
control. Poly A+ RNA derived from cells treated with/without R1881 was extracted 
at indicated time points with RNAzol B (Tel-Test, Inc, TX) and fractionated 
(2 µg/lane) by running on 1% formaldehyde-agarose gel and transferred to nylon 
membrane. Northern blots were analyzed for the expression of PCGEM1 using the 
nucleic acid molecule set forth in SEQ ID NO: 4 as a probe. The RNA from LNCaP 
cells treated with R1881 and RNA from control LNCaP cells were also analyzed by 
RT-PCR assays as described in Example 4. 

As set forth in Figures 5a and 5b, PCGEM1 expression increases in response 
to androgen treatment. This finding further supports the hypothesis that the PCGEM1 
expression is regulated by androgens in prostate cancer cells. 

EXAMPLE 7: Tissue Specificity of PCGEM1 Expression 

Multiple tissue Northern blots (Clontech, CA) conducted according to the 
manufacturer's directions revealed prostate tissue-specific expression of PCGEM1. 
Polyadenylate RNAs of 23 different human tissues (heart, brain, placenta, lung, liver, 
skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testis, ovary, small 
intestine, colon, peripheral blood, stomach, thyroid, spinal cord, lymph node, trachea, 
adrenal gland and bone marrow) were probed with the 530 base pair PCGEM1 cDNA 
fragment (nucleotides 410 to 940 of SEQ ID NO:1). A 1.7 kilobase mRNA transcript 
hybridized to the PCGEM1 probe in prostate tissue (Figure 6a). Hybridization was 
not observed in any of the other human tissues (Figure 6a). Two independent 
experiments revealed identical results. 

Additional Northern blot analyses on an RNA master blot (Clontech, CA) 
conducted according to the manufacturer's directions confirm the prostate tissue 
specificity of the PCGEM1 gene (Figure 6b). Northern blot analyses reveal that the 
prostate tissue specificity of PCGEM1 is comparable to the well known prostate 
marker PSA (77mer oligo probe) and far better than two other prostate specific genes 
PSMA (234 bp fragment from PCR product) and NKX3.1 (210 bp cDNA). For 
instance, PSMA is expressed in the brain (37) and in the duodenal mucosa and a 
subset of proximal renal tubules (38). While NKX3.1 exhibits high levels of 
expression in adult prostate, it is also expressed at lower levels in testis tissue and 
several other tissues (39). 



EXAMPLE 8: Biologic Functions of PCGEM1 

The tumor associated PCGEM1 overexpression suggested that the increased 
expression of PCGEM1 may favor tumor cell proliferation. NIH3T3 cells have been 
extensively used to define cell growth promoting functions associated with a wide 
variety of genes (40-44). Utilizing pcDNA3.1/Hygro(+/-) (Invitrogen, CA), PCGEM1 
expression vectors were constructed in sense and anti-sense orientations and were 
transfected into NIH3T3 cells, and hygromycin resistant colonies were counted 2-3 
weeks later. Cells transfected with PCGEM1 sense construct formed about 2 times 
more colonies than vector alone in three independent experiments (Figure 10). The 
size of the colonies in PCGEM1 sense construct transfected cells was significantly 
larger. No appreciable difference was observed in the number of colonies between 
anti-sense PCGEM1 constructs and vector controls. These promising results 
document a cell growth promoting/cell survival function(s) associated with PCGEM1. 

The function of PCGEM1, however, does not appear to be due to protein 
expression. To assess this hypothesis, we used the TestCode program (GCG 
Wisconsin Package, Madison, WI), which identifies potential protein coding 
sequences of longer than 200 bases by measuring the non-randomness of the 
composition at every third base, independently from the reading frames. Analysis of 
the PCGEM1 cDNA sequence revealed that, at greater than 95% confidence level, the 
sequence does not contain any region with protein coding capacity (Figure 16A). 
Similar results were obtained when various published non-coding RNA sequences 
were analyzed with the TestCode program (data not shown), while known protein 
coding regions of similar size i.e., alpha actin (Figure 16B) can be detected with high 
fidelity. (In Figure 16, evaluation of the coding capacity of the PCGEM1 (A) and the 
human alpha actin (B), is performed independently from the reading frame, by using 
the TestCode program. The number of base pairs is indicated on the X-axis, the 
TestCode values are shown on the Y-axis. Regions of longer than 200 base pairs 
above the upper line (at 9.5 value) are considered coding, under the lower line (at 7.3 
value) are considered non-coding, at a confidence level greater than 95%.) 
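
The idea of measuring non-randomness of the composition at every third base, independently of the reading frame, can be illustrated with a per-base position asymmetry. The sketch below is not the GCG TestCode implementation and does not produce TestCode values, so the 9.5 and 7.3 thresholds do not apply to its output. The ratio used (largest count divided by smallest count plus one) is an assumption chosen for illustration.

```python
def position_asymmetry(seq):
    """Return a per-base asymmetry of occurrence across the three codon positions.

    For each base, occurrences are counted at sequence positions 0, 1 and 2
    modulo 3, and the asymmetry is max(count) / (min(count) + 1). A base that
    is spread evenly over the three positions gives a value near 1, while a
    base concentrated at one position gives a large value. Taking the max and
    min over all three positions makes the result independent of the frame.
    """
    seq = seq.upper()
    result = {}
    for base in "ACGT":
        counts = [0, 0, 0]
        for i, b in enumerate(seq):
            if b == base:
                counts[i % 3] += 1
        result[base] = max(counts) / (min(counts) + 1)
    return result


if __name__ == "__main__":
    print(position_asymmetry("ATGATGATG"))
```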

The Codon Preference program (GCG Wisconsin Package, Madison, WI), 
which locates protein coding regions in a reading frame specific manner further 
suggested the absence of protein coding capacity in the PCGEM1 gene (see 
www.cpdr.org). In vitro transcription/translation of PCGEM1 cDNA did not produce 
a detectable protein/peptide. Although we cannot unequivocally rule out the 
possibility that PCGEM1 codes for a short unstable peptide, at this time both 
experimental and computational approaches strongly suggest that PCGEM1 cDNA 
does not have protein coding capacity. (It should be recognized that conclusions 
regarding the role of PCGEM1 are speculative in nature, and should not be considered 
limiting in any way.) 

The most intriguing aspect of PCGEM1 characterization has been its apparent 
lack of protein coding capacity. Although we have not completely ruled out the 
possibility that PCGEM1 codes for a short unstable peptide, careful sequencing of 
PCGEM1 cDNA and genomic clones, computational analysis of PCGEM1 sequence, 
and in vitro transcription/translation experiments (data not shown) strongly suggest a 
non-coding nature of PCGEM1. It is interesting to note that an emerging group of 
novel mRNA-like non-coding RNAs are being discovered whose function and 
mechanisms of action remain poorly understood (49). Such RNA molecules have also 
been termed "RNA riboregulators" because of their function(s) in development, 
differentiation, DNA damage, heat shock responses and tumorigenesis (40-42, 50). In 
the context of tumorigenesis, the H19, His-1 and Bic genes code for functional 
non-coding mRNAs (50). In addition, a recently reported prostate cancer associated gene, 
DD3, also appears to exhibit a tissue specific non-coding mRNA (51). In this regard it 
is important to point out that PCGEM1 and DD3 may represent a new class of 
prostate specific genes. The recent discovery of a steroid receptor co-activator as an 
mRNA lacking protein coding capacity further emphasizes the role of RNA 
riboregulators in critical biochemical function(s) (52). Our preliminary results 
showed that PCGEM1 expression in NIH3T3 cells caused a significant increase in the 
size of colonies in a colony forming assay, suggesting that PCGEM1 cDNA confers 
cell proliferation and/or cell survival function(s). Elevated expression of PCGEM1 in 
prostate cancer cells may represent a gain in function favoring tumor cell 
proliferation/survival. On the basis of our first characterization of the PCGEM1 gene, we 
propose that PCGEM1 belongs to a novel class of prostate tissue specific genes with 
potential functions in prostate cell biology and the tumorigenesis of the prostate gland. 

In summary, utilizing surgical specimens and rapid differential display 
technology, we have identified candidate genes of interest with differential expression 
profile in prostate cancer specimens. In particular, we have identified a novel 
nucleotide sequence, PCGEM1, with no match in the publicly available DNA 
databases (except for the homology shown in the high throughput genome sequence 
database, discussed above). A PCGEM1 cDNA fragment detected a 1.7 kb mRNA on 
Northern blots with selective expression in prostate tissue. Furthermore, this gene 
was found to be up-regulated by the synthetic androgen, R1881. Careful analysis of 
microdissected matched tumor and normal tissues further revealed PCGEM1 over- 
expression in a significant percentage of prostate cancer specimens. Thus, we have 
provided a gene with broad implications for the diagnosis, prevention, and treatment 
of prostate cancer. 

The specification is most thoroughly understood in light of the teachings of the 
references cited within the specification which are hereby incorporated by reference. 
The embodiments within the specification provide an illustration of embodiments of 
the invention and should not be construed to limit the scope of the invention. The 
skilled artisan readily recognizes that many other embodiments are encompassed by 
the invention. 
We claim: 

1. An isolated nucleic acid molecule selected from: 

(a) the polynucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or 
SEQ ID NO:8; 

(b) an isolated nucleic acid molecule that hybridizes to either strand of 
a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under 
conditions of moderate stringency in about 50% formamide and about 6X SSC at 
about 42°C with washing conditions of approximately 60°C, about 0.5X SSC, and 
about 0.1% SDS; 

(c) an isolated nucleic acid molecule that hybridizes to either strand of 
a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under 
conditions of high stringency in about 50% formamide and about 6X SSC, with 
washing conditions of approximately 68°C, about 0.2X SSC, and about 0.1% SDS; 

(d) an isolated nucleic acid molecule derived by in vitro mutagenesis 
from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; 

(e) an isolated nucleic acid molecule degenerate from SEQ ID NO:1, 
SEQ ID NO:2, or SEQ ID NO:8, as a result of the genetic code; and 

(f) an isolated nucleic acid molecule selected from the group consisting 
of human PCGEM1 DNA, an allelic variant of human PCGEM1 DNA, and a species 
homolog of PCGEM1 DNA. 

2. A recombinant vector that directs the expression of the nucleic acid 
molecule of claim 1. 

3. A host cell transfected or transduced with the vector of claim 2. 

4. The host cell of claim 3 selected from bacterial cells, yeast cells, and 
animal cells. 

5. An isolated nucleic acid molecule comprising the polynucleotide sequence 
selected from SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID 
NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID 
NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID 
NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22. 

6. A method of detecting prostate cancer in a patient, the method comprising: 
(a) detecting PCGEM1 mRNA in a biological sample from the 
patient; and 

(b) correlating the amount of PCGEM1 mRNA in the sample with 
the presence of prostate cancer in the patient. 

7. The method according to claim 6, wherein step (a) includes: 

(a) isolating RNA from the sample; 

(b) amplifying a PCGEM1 cDNA molecule; 

(c) incubating the PCGEM1 cDNA with the nucleic acid according 
to claim 1 or 5; and 

(d) detecting hybridization between the PCGEM1 cDNA and the 
nucleic acid. 

8. The method according to claim 7, wherein the PCGEM1 cDNA is 
amplified with at least two nucleotide sequences selected from SEQ ID NO: 5, SEQ 
ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID 
NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID 
NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ 
ID NO: 22. 

9. The method according to claim 8, wherein the at least two nucleotide 
sequences are SEQ ID NO:15 and SEQ ID NO:22. 

10. A method according to claim 6, wherein the biological sample is selected 
from blood, urine, and prostate tissue. 

11. The method according to claim 10, wherein the biological sample is 
blood. 

12. A vector, comprising a PCGEM1 promoter sequence operatively linked to 
a nucleotide sequence encoding a cytotoxic protein. 

13. The vector of claim 12, wherein the PCGEM1 promoter sequence is a 
nucleic acid molecule comprising the polynucleotide sequence of SEQ ID NO:3. 

14. A method of selectively killing a prostate cancer cell, the method 
comprising: 

(a) introducing the vector according to claim 12 to the prostate cancer 
cell under conditions sufficient to permit selective cell killing. 
15. The method according to claim 14, wherein the cytotoxic protein is 
selected from ricin, abrin, diphtheria toxin, p53, thymidine kinase, tumor necrosis 
factor, cholera toxin, Pseudomonas aeruginosa exotoxin A, ribosomal inactivating 
proteins, and mycotoxins. 

16. A method of identifying an androgen-responsive cell line, the method 
comprising: 

(a) obtaining a cell line suspected of being androgen responsive, 

(b) incubating the cell line with an androgen; and 

(c) detecting PCGEM1 mRNA in the cell line, 

wherein an increase in PCGEM1 mRNA, as compared to an untreated cell 
line, correlates with the cell line being androgen responsive. 

17. A method of measuring the responsiveness of a prostate tissue to 
hormone-ablation therapy, the method comprising: 

(a) treating the prostate tissue with hormone ablation therapy; and 

(b) measuring PCGEM1 mRNA in the prostate tissue following 
hormone ablation therapy, 

wherein a decrease in PCGEM1 mRNA, as compared to an untreated cell line, 
correlates with the prostate tissue responding to hormone ablation therapy. 
cDNA sequence of PCGEM1 Seq. ID No. 1 

AAGGCACTCT GGCACCCAGT TTTGGAACTG CAGTTTTAAA AGTCATAAAT TGAATGAAAA TGATAGCAAA 70 

GGTGGAGGTT TTTAAAGAGC TATTTATAGG TCCCTGGACA GCATCTTTTT TCAATTAGGC AGCAACCTTT 140 

TTGCCCTATG CCGTAACCTG TGTCTGCAAC TTCCTCTAAT TGGGAAATAG TTAAGCAGAT TCATAGAGCT 210 

GAATGATAAA ATTGTACTAC GAGATGCACT GGGACTCAAC GTGACCTTAT CAAGTGAGCA GGCTTGGTGC 280 

ATTTGACACT TCATGATATC ATCGAAAGTG GAACTAAAAA CAGCTCCTGG AAGAGGACTA TGACATCATC 350 

AGGTTGGGAG TCTCCAGGGA CAGCGGACCC TTTGGAAAAG GACTAGAAAG TGTGAAATCT ATTAGTCTTC 420 

GATATGAAAT TCTCTGTCTC TGTAAAAGGA TTTCATATTT ACAAGACACA GGCCTACTCC TAGGGCAGCA 490 

AAAAGTGGCA ACAGGCAAGC AGAGGGAAAA GAGATCATGA GGCATTTCAG AGTGCACTGT CTTTTCATAT 560 

ATTTCTCAAT GCCGTATGTT TGGTTTTATT TTGGCCAAGC ATAACAATCT GCTCAAGAAA AAAAAATCTG 630 

GAGAAAACAA AGGTGCCTTT GCCAATGTTA TGTTTCTTTT TGACAAGCCC TGAGATTTCT GAGGGGAATT 700 

CACATAAATG GGATCAGGTC ATTCATTTAC GTTGTGTGCA AATATGATTT AAAGATACAA CCTTTGCAGA 770 

GAGCATGCTT TCCTAAGGGT AGGCACGTGG AGGACTAAGG GTAAAGCATT CTTCAAGATC AGTTAATCAA 840 

GAAAGGTGCT CTTTGCATTC TGAAATGCCC TTGTTGCAAA TATTGGTTAT ATTGATTAAA TTTACACTTA 910 

ATGGAAACAA CCTTTAACTT ACAGATGAAC AAACCCACAA AAGCAAAAAA TCAAAAGCCC TACCTATGAT 980 

TTCATATTTT CTGTGTAACT GGATTAAAGG ATTCCTGCTT GCTTTTGGGC ATAAATGATA ATGGAATATT 1050 

TCCAGGTATT GTTTAAAATG AGGGCCCATC TACAAATTCT TAGCAATACT TTGGATAATT CTAAAATTCA 1120 

GCTGGACATT GTCTAATTGT TTTTTATATA CATCTTTGCT AGAATTTCAA ATTTTAAGTA TGTGAATTTA 1190 

GTTAATTAGC TGTGCTGATC AATTCAAAAA CATTACTTTC CTAAATTTTA GACTATGAAG GTCATAAATT 1260 

CAACAAATAT ATCTACACAT ACAATTATAG ATTGTTTTTC ATTATAATGT CTTCATCTTA ACAGAATTGT 1330 

CTTTGTGATT GTTTTTAGAA AACTGAGAGT TTTAATTCAT AATTACTTGA TCAAAAAATT GTGGGAACAA 1400 

TCCAGCATTA ATTGTATGTG ATTGTTTTTA TGTACATAAG GAGTCTTAAG CTTGGTGCCT TGAAGTCTTT 1470 

TGTACTTAGT CCCATGTTTA AAATTACTAC TTTATATCTA AAGCATTTAT GTTTTTCAAT TCAATTTACA 1540 

TGATGCTAAT TATGGCAATT ATAACAAATA TTAAAGATTT CGAAATAGAA AAAAAAAAAA AAA 1603 
cDNA sequence of PCGEM1 Seq. ID No. 2 

GCGGCCGCGT CGACGCAACT TCCTCTAATT GGGAAATAGT TAAGCAGATT CATAGAGCTG AATGATAAAA 70 

TTGTACTTCG AGATGCACTG GGACTCAACG TGACCTTATC AAGTGAGATG GAGTCTTGCC CTGTCTCCAA 140 

GGCTGGAGCC CAATGGTGTG ATCTTGGCTC ACTGCAACCT CCACCTCCCA GGTTCAAACG TTTCTCCTGC 210 

CTCAGCCTCC CAAGTAACTG GGATTACAGC AGGCTTGGTG CATTTGACAC TTCATGATAT CAGCCAAAGT 280 

GGAACTAAAA ACAGCTCCTG GAAGAGGACT ATGACATCAT CAGGTTGGGA GTCTCCAGGG ACAGCGGACC 350 

CTTTGGAAAA GGACTAGAAA GTGTGAAATC TATTAGTCTT CGATATGAAA TTCTCTGTCT CCGTAAAAGC 420 

ATTTCATATT TACAAGACAC AGGCCTACTC CTAGGGCAGC AAAAAGTGGC AACAGGCAAG CAGAGGGAAA 490 

AGAGATCATG AGGCATTTCA GAGTGCACTG TCTTTTCATA TATTTCTCAA TGCCGTATGT TTGGTTTTAT 560 

TTTGGCCAAG CATAACAATC TGCTCAAAAA AAAAAAATCT GGAGAAAACA AAGGTGCCTT TGCCAATGTT 630 

ATGTTTCTTT TTGACAAGCC CTGAGATTTC TGAGGGGAAT TCACATAAAT GGGATCAGGT CATTCATTTA 700 

CGTTGTGTGC AAATATGATT TAAAGATACA ACCTTTGCAG AGAGCATGCT TTCCTAAGGG TAGGCACGTG 770 

GAGGACTAAG GGTAAAGCAT TCTTCAAGAT CAGTTAATCA AGAAAGGTGC TCTTTGCATT CTGAAATGCC 840 

CTTGTTGCAA ATATTGGTTA TATTGATTAA ATTTACACTT AATGGAAACA ACCTTTAACT TACAGATGAA 910 

CAAACCCGAC AAAAGCAAAA AATCAAAAGC CCTACCTATG ATTTCATATT TTCTGTGTAA CTGGATTAAA 980 

GGATTCCTGC TTGCTTTTGG GCATAAATGA TAATGGAATA TTTCCAGGTA TTGTTTAAAA TGAGGGCCCA 1050 

TCTACAAATT CTTAGCAATA CTTTGGATAA TTCTAAAATT CAGCTGGACA TTGTCTAATT GTTTTTTATA 1120 

TACATCTTTG CTAGAATTTC AAATTTTAAG TATGTGAATT TAGTTAATTA GCTGTGCTGA TCAATTCAAA 1190 

AACATTACTT TCCTAAATTT TAGACTATGA AGGTCATAAA TTCAACAAAT ATATCTACAC ATACAATTAT 1260 

AGATTGTTTT TCATTATAAT GTCTTCATCT TAACAGAATT GTCTTTGTGA TTGTTTTTAG AAAACTGAGA 1330 

GTTTTAATTC ATAATTACTT GATCAAAAAA TTGTGGGAAC AATCCAGCAT TAATTGTATG TGATTGTTTT 1400 

TATGTACATA AGGAGTCTTA AGCTTGGTGC CTTGAAGTCT TTTGTACTTA GTCCCATGTT TAAAATTACT 1470 

ACTTTATATC TAAAGCATTT ATGTTTTTCA ATTCAATTTA CATGATGCTA ATTATGGCAA TTATAACAAA 1540 

TATTAAAGAT TTCGAAATAG AAAAAAAAAA AAAAATCTA 1579 

FIG. 9 
cDNA sequence of PCGEM1 Promoter Region Seq. ID No. 3 

PCT/US00/07906 

TCCCTCTTGC GTTCTGCAAT TTCTGAAAAA AAGATGTTTA TTGCAAAGTG ATATGAGCAC TGGAAAGGTA 70 

CTAATTCCAA TTTGATTCTA ATTGGATGAG TGACATGGGT AAGCGATTCT AAGCATTTGT GTTTTTTTTA 140 

GTAGTATGGA ATTTAATTAG TTCTCAGTAT GTTAGTGAAG ATGAATGAAA ACATGCATAT GTTTCCATGT 210 

ATTATAAATA TTTTAAAATG CAAAAAATTA TTCTAATGAA TATATAAATA TAAAGCATAA CAATAATAAT 280 

ACAATACCAC CCATAAAGTC ATCATCTAAT TTAAAAACTA AAACATTAAC ACTTGAATCT CCCCCATTGC 350 

AACATCTTTC CCGACTTGTG TGTTTTTTTC TTTTGCTTTT AAAATTTTTG TTTTATCATA TGTCTGCATA 420 

AGATTATATA GCTTTCCTTG TTTTAAGCTT TTTAAATAAT ATATTGTAGT TATATTATTT GTGCTTTGCT 490 

TTTTTTACTT AACATTATGG TTCTAAAATT CAGTAATGTG TTGGGCATGT ATAATTTGTT TATTTTTAAT 560 

CTCTTTGACA TTCGACTATA TAAATTTCAG TTTGTTTATT GACTCCTTTG TCTATAGATA CTCTGCTATT 630 

TCTGTTTTTG CTGTTACAAA AATAATGCTG TTTTAAATTT CATTTTGTAT ACTTTTTTGA GGCATGTGTA 700 

TGAGTTATTC TAAGGTAAAA AAATAAGAAA AAATTGCTGG GTTATAAGAT TGTCACATGC TCGAATTTAC 770 

AAGATAATGC CAAATCATTT TTCAAAGTAA TTATACCTAT TTATACTACC GGTATGAGTA TATTGGTGCC 840 

CACATAGTTG CTTGTTCTGC CAAAGTTTGG TATGATCGAA CAATAATTTT TGCCCATCAA ATGGCATAAA 910 

ATAAAATCTC AGTGTGCTTT TAATTTGCAT TTTCTATGTT TAAGAATTGT TTCTTTTTTA ACCATTTATA 980 

ATTTACTTTT GCTGAAATGC TTGCTTATTA TTTTTGCTCC CCATTTTTTC CTATTGGATT GCTTTTCTCA 1050 

TTAATTTATA AGAATTTTAT ATGGTTTAGA TACTAATTAT TATATTACTG AAAATACCTT TATCAGTTTG 1120 

TTGTGTACTT TCTACTTTAT GTCTTGTGAT GGATAAAAGT TTTAAATTGT ATTGTGTTGA AGTTAACATT 1190 

TTTAAATTTT ATAATCAGCA TCTTTAATAA TCTCTTTMTA AAATTTTCCT TTACATAGAT GTCATAAAGA 1260 

TACATCTCTA TAATTTCTTA TTTTTTTGGC ATATGTTCAT TAAGTCATTT TATCATTTTT TAGTAATAAA 1330 

TTGCAGTTAT TTATGAAACA AATAATTTTT AAAATTATAT ATGCTTTCTT TAAAAATTGA TCTTAGCATG 1400 

CTTCACTATG AAGCTTGAGG CTTCACTGCA CGTTGTACTG AAATTATGTA TAAAACAGTG GTTCTGAAAA 1470 

TCTCTGAGTT CATGACACCT TTAGTGTCTC AGGTTTTTTT GCTTTTGTTC TTGTTTTTTC TCACAAAGCA 1540 

CCTAAGTTAA ATAAAAACAA AGCACAAAGC TATCAGCTTC ATGTATTAAG TAGTAAGCTC GCATGTTAAC 1610 

AGTTGTAACT TGCCTGGTGC CCAATAGATG TCACTCTGTT TTCCTAGAAA CTTTAAAATA TCCCTCAGTG 1680 

CTCCTGTTAA TTCATGGTAG TGCCCCAAGG CACTCTGGCA CCCAGTTTTG GAACTGCAGT TTTAAAAGTC 1750 

ATAAATTGAA TGAAAATGAT AGCAAAGGTG GAGGTTTTTA AAGAGCTATT TATAGGTCCC TGGACAGCA 1819 
cDNA sequence of PCGEM1 PROBE Seq. ID No. 4 

TTTTTTCAAT TAGGCAGCAA CCTTTTTGCC CTATGCCGTA ACCTGTGTCT GCAACTTCCT CTAATTGGGA 70 

AATAGTTAAG CAGATTCATA GAGCTGAATG ATAAAATTGT ACTACGAGAT GCACTGGGAC TCAACGTCAC 140 

CTTATCAAGT GAGCAGGCTT GGTGCATTTG ACACTTCATG ATATCATCCA AAGTGGAACT AAAAACAGCT 210 

CCTGGAAGAG GACTATGACA TCATCAGGTT GGGAGTCTCC AGGGACAGCG GACCCTTTGG AAAAGGACTA 280 

GAAAGTGTGA AATCTATTAG TCTTCGATAT GAAATTCTCT GTCTCTGTAA AAGCATTTCA TATTTACAAG 350 

ACACAGGCCT ACTCCTAGGG CAGCAAAAAG TGGCAACAGG CAAGCAGAGG GAAAAGAGAT CATGAGGCAT 420 

TTCAGAGTGC ACTGTCTTTT CATATATTTC TCAATGCCGT ATGTTTGGTT TTATTTTGGC CAAGCATAAC 490 

AATCTGCTCA AGAAAAAAAA ATCTGGAGAA AACAAAGGTG CCTTTGCCAA TGTTATGTTT CTTTTTGACA 560 

AGCCCTGAGA TTTCTGAGGG GAATTCACAT AAATGGGATC AGGTCATTCA TTTACGTTGT GTGCAAATAT 630 

GATTTAAAGA TACAACCTTT GCAGAGAGCA TGCTTTCCTA AGGGTAGGCA CGTGGAGGAC TAAGGGTAAA 700 

GCATTCTTCA AGATCAGTTA ATCAAGAAAG GTGCTCTTTG CATTCTGAAA TGCCCTTGTT GCAAATATTG 770 

GTTATATTGA TTAAATTTAC ACTTAATGGA AACAACCTTT AACTTACAGA TGAACAAACC CACAAAAGCA 840 

AAAAATCAAA AGCCCTACCT ATGATTTCAT ATTTTCTGTG TAACTGGATT AAAGGATTCC TGCTTGCTTT 910 

TGGGCATAAA TGATAATGGA ATATTTCCAG GTATTGTTTA AAATGAGGGC CCATCTACAA ATTCTTAGCA 980 

ATACTTTGGA TAATTCTAAA ATTCAGCTGG ACATTGTCTA ATTGT 1025 



FIG. 12 
PCGEM1 Primers Used for PCR 

PCR PRIMER 1 (SEQ ID No. 5) 

Sense Primer 5' TGCCTCAGCCTCCCAAGTAAC 3' 

PCR PRIMER 2 (SEQ ID No. 6) 

Antisense Primer 5' GGCCAAAATAAAACCAAACAT 3' 

PCR PRIMER 3 (SEQ ID No. 7) 

Sense Primer 5' TGGCAACAGGCAAGCAGAG 3' 

FIG. 13 
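
How a sense primer and an antisense primer bracket an amplified region can be checked with a short sketch. The helper names `reverse_complement` and `find_primer` are hypothetical and chosen for this illustration only; the template is the stretch of SEQ ID NO:2 numbered 421 to 600 in the Sequence Listing, and the primers are PCR primers 3 and 2 listed above. A sense primer is searched for directly, whereas an antisense primer is searched for as its reverse complement, since it anneals to the opposite strand. This is a plain string search and makes no statement about annealing temperature or the PCR conditions actually used.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_primer(template: str, primer: str, antisense: bool = False) -> int:
    """Return the 0-based index of the primer site on the template, or -1.

    For an antisense primer, the site on the given (sense) strand is the
    reverse complement of the primer.
    """
    target = reverse_complement(primer) if antisense else primer
    return template.upper().find(target.upper())


# Positions 421-600 of SEQ ID NO:2, copied from the Sequence Listing.
EXCERPT = """
atttcatatt tacaagacac aggcctactc ctagggcagc aaaaagtggc aacaggcaag
cagagggaaa agagatcatg aggcatttca gagtgcactg tcttttcata tatttctcaa
tgccgtatgt ttggttttat tttggccaag cataacaatc tgctcaaaaa aaaaaaatct
"""
TEMPLATE = "".join(EXCERPT.split())  # drop spaces and line breaks

PRIMER_3 = "TGGCAACAGGCAAGCAGAG"    # sense primer, SEQ ID No. 7
PRIMER_2 = "GGCCAAAATAAAACCAAACAT"  # antisense primer, SEQ ID No. 6

start = find_primer(TEMPLATE, PRIMER_3)
site = find_primer(TEMPLATE, PRIMER_2, antisense=True)
# Length of the region spanned by the two primers on this excerpt.
product_length = site + len(PRIMER_2) - start
print(start, site, product_length)
```

Both indices are offsets within the excerpt, so adding 421 converts them to the position numbering of SEQ ID NO:2.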
Complete Genomic DNA sequence of PCGEM1 gene. 

[Sequence text of this figure is not legibly reproduced; the genomic sequence is set forth as SEQ ID NO:8 in the Sequence Listing.] 

FIG. 14 
[Genomic DNA sequence of PCGEM1 gene, continued; sequence text not legibly reproduced.] 

FIG. 14(cont'd-1) 
[Genomic DNA sequence of PCGEM1 gene, continued; sequence text not legibly reproduced.] 

FIG. 14(cont'd-2) 
[Genomic DNA sequence of PCGEM1 gene, continued; sequence text not legibly reproduced.] 

FIG. 14(cont'd-3) 
TESTCODE OF: vslnuc ck: 6724, 1 to: 1588 
WINDOW: 200 bp MARCH 14, 1999 20:25 

FIG. 16A 

TESTCODE OF: humoctosk.gb_pr2 ck: 9544, 1 to: 1374 
WINDOW: 200 bp MARCH 14, 1999 20:23 
SEQUENCE LISTING 

<110> Srikantan, Vasantha 
Zou, Zhiqiang 
Moul, Judd W. 
Srivastava, Shiv 

<120> PROSTATE-SPECIFIC GENE, PCGEM1, AND METHODS OF USING 
PCGEM1 TO DETECT, TREAT, AND PREVENT PROSTATE CANCER 

<130> 4995.0053-003-04 

<140> 
<141> 

<150> 60/126,469 
<151> 1999-03-26 

<160> 22 

<170> PatentIn Ver. 2.1 

<210> 1 

<211> 1603 

<212> DNA 

<213> Homo sapiens 

<400> 1 

aaggcactct ggcacccagt tttggaactg cagttttaaa agtcataaat tgaatgaaaa 60 
tgatagcaaa ggtggaggtt tttaaagagc tatttatagg tccctggaca gcatcttttt 120 
tcaattaggc agcaaccttt ttgccctatg ccgtaacctg tgtctgcaac ttcctctaat 180 
tgggaaatag ttaagcagat tcatagagct gaatgataaa attgtactac gagatgcact 240 
gggactcaac gtgaccttat caagtgagca ggcttggtgc atttgacact tcatgatatc 300 
atccaaagtg gaactaaaaa cagctcctgg aagaggacta tgacatcatc aggttgggag 360 
tctccaggga cagcggaccc tttggaaaag gactagaaag tgtgaaatct attagtcttc 420 
gatatgaaat tctctgtctc tgtaaaagca tttcatattt acaagacaca ggcctactcc 480 
tagggcagca aaaagtggca acaggcaagc agagggaaaa gagatcatga ggcatttcag 540 
agtgcactgt cttttcatat atttctcaat gccgtatgtt tggttttatt ttggccaagc 600 
ataacaatct gctcaagaaa aaaaaatctg gagaaaacaa aggtgccttt gccaatgtta 660 
tgtttctttt tgacaagccc tgagatttct gaggggaatt cacataaatg ggatcaggtc 720 
attcatttac gttgtgtgca aatatgattt aaagatacaa cctttgcaga gagcatgctt 780 
tcctaagggt aggcacgtgg aggactaagg gtaaagcatt cttcaagatc agttaatcaa 840 
gaaaggtgct ctttgcattc tgaaatgccc ttgttgcaaa tattggttat attgattaaa 900 
tttacactta atggaaacaa cctttaactt acagatgaac aaacccacaa aagcaaaaaa 960 
tcaaaagccc tacctatgat ttcatatttt ctgtgtaact ggattaaagg attcctgctt 1020 
gcttttgggc ataaatgata atggaatatt tccaggtatt gtttaaaatg agggcccatc 1080 
tacaaattct tagcaatact ttggataatt ctaaaattca gctggacatt gtctaattgt 1140 
tttttatata catctttgct agaatttcaa attttaagta tgtgaattta gttaattagc 1200 

tgtgctgatc aattcaaaaa cattactttc ctaaatttta gactatgaag gtcataaatt 1260 
caacaaatat atctacacat acaattatag attgtttttc attataatgt cttcatctta 1320 
acagaattgt ctttgtgatt gtttttagaa aactgagagt tttaattcat aattacttga 1380 
tcaaaaaatt gtgggaacaa tccagcatta attgtatgtg attgttttta tgtacataag 1440 
gagtcttaag cttggtgcct tgaagtcttt tgtacttagt cccatgttta aaattactac 1500 
tttatatcta aagcatttat gtttttcaat tcaatttaca tgatgctaat tatggcaatt 1560 
ataacaaata ttaaagattt cgaaatagaa aaaaaaaaaa aaa 1603 



<210> 2 

<211> 1579 

<212> DNA 

<213> Homo sapiens 

<400> 2 

gcggccgcgt cgacgcaact tcctctaatt gggaaatagt taagcagatt catagagctg 60 
aatgataaaa ttgtacttcg agatgcactg ggactcaacg tgaccttatc aagtgagatg 120 
gagtcttgcc ctgtctccaa ggctggagcc caatggtgtg atcttggctc actgcaacct 180 
ccacctccca ggttcaaacg tttctcctgc ctcagcctcc caagtaactg ggattacagc 240 
aggcttggtg catttgacac ttcatgatat cagccaaagt ggaactaaaa acagctcctg 300 
gaagaggact atgacatcat caggttggga gtctccaggg acagcggacc ctttggaaaa 360 
ggactagaaa gtgtgaaatc tattagtctt cgatatgaaa ttctctgtct ccgtaaaagc 420 
atttcatatt tacaagacac aggcctactc ctagggcagc aaaaagtggc aacaggcaag 480 
cagagggaaa agagatcatg aggcatttca gagtgcactg tcttttcata tatttctcaa 540 
tgccgtatgt ttggttttat tttggccaag cataacaatc tgctcaaaaa aaaaaaatct 600 
ggagaaaaca aaggtgcctt tgccaatgtt atgtttcttt ttgacaagcc ctgagatttc 660 
tgaggggaat tcacataaat gggatcaggt cattcattta cgttgtgtgc aaatatgatt 720 
taaagataca acctttgcag agagcatgct ttcctaaggg taggcacgtg gaggactaag 780 
ggtaaagcat tcttcaagat cagttaatca agaaaggtgc tctttgcatt ctgaaatgcc 840 
cttgttgcaa atattggtta tattgattaa atttacactt aatggaaaca acctttaact 900 
tacagatgaa caaaccccac aaaagcaaaa aatcaaaagc cctacctatg atttcatatt 960 
ttctgtgtaa ctggattaaa ggattcctgc ttgcttttgg gcataaatga taatggaata 1020 
tttccaggta ttgtttaaaa tgagggccca tctacaaatt cttagcaata ctttggataa 1080 
ttctaaaatt cagctggaca ttgtctaatt gttttttata tacatctttg ctagaatttc 1140 
aaattttaag tatgtgaatt tagttaatta gctgtgctga tcaattcaaa aacattactt 1200 
tcctaaattt tagactatga aggtcataaa ttcaacaaat atatctacac atacaattat 1260 
agattgtttt tcattataat gtcttcatct taacagaatt gtctttgtga ttgtttttag 1320 
aaaactgaga gttttaattc ataattactt gatcaaaaaa ttgtgggaac aatccagcat 1380 
taattgtatg tgattgtttt tatgtacata aggagtctta agcttggtgc cttgaagtct 1440 
tttgtactta gtcccatgtt taaaattact actttatatc taaagcattt atgtttttca 1500 
attcaattta catgatgcta attatggcaa ttataacaaa tattaaagat ttcgaaatag 1560 
aaaaaaaaaa aaaaatcta 1579 



<210> 3 

<211> 1819 

<212> DNA 

<213> Homo sapiens 

<400> 3 

tccctcttgc gttctgcaat ttctgaaaaa aagatgttta ttgcaaagtg atatgagcac 60 
tggaaaggta ctaattccaa tttgattcta attggatgag tgacatgggt aagcgattct 120 
aagcatttgt gtttttttta gtagtatgga atttaattag ttctcagtat gttagtgaag 180 
atgaatgaaa acatgcatat gtttccatgt attataaata ttttaaaatg caaaaaatta 240 
ttctaatgaa tatataaata taaagcataa caataataat acaataccac ccataaagtc 300 
atcatctaat ttaaaaacta aaacattaac acttgaatct cccccattgc aacatctttc 360 
ccgacttgtg tgtttttttc ttttgctttt aaaatttttg ttttatcata tgtctgcata 420 
agattatata gctttccttg ttttaagctt tttaaataat atattgtagt tatattattt 480 
gtgctttgct ttttttactt aacattatgg ttctaaaatt cagtaatgtg ttgggcatgt 540 
ataatttgtt tatttttaat ctctttgaca ttcgactata taaatttcag tttgtttatt 600 
gactcctttg tctatagata ctctgctatt tctgtttttg ctgttacaaa aataatgctg 660 
ttttaaattt cattttgtat acttttttga ggcatgtgta tgagttattc taaggtaaaa 720 
aaataagaaa aaattgctgg gttataagat tgtcacatgc tcgaatttac aagataatgc 780 
caaatcattt ttcaaagtaa ttatacctat ttatactacc ggtatgagta tattggtgcc 840 
cacatagttg cttgttctgc caaagtttgg tatgatcgaa caataatttt tgcccatcaa 900 
atggcataaa ataaaatctc agtgtgcttt taatttgcat tttctatgtt taagaattgt 960 
ttctttttta accatttata atttactttt gctgaaatgc ttgcttatta tttttgctcc 1020 
ccattttttc ctattggatt gcttttctca ttaatttata agaattttat atggtttaga 1080 
tactaattat tatattactg aaaatacctt tatcagtttg ttgtgtactt tctactttat 1140 
gtcttgtgat ggataaaagt tttaaattgt attgtgttga agttaacatt tttaaatttt 1200 
ataatcagca tctttaataa tctctttmta aaattttcct ttacatagat gtcataaaga 1260 
tacatctcta taatttctta tttttttggc atatgttcat taagtcattt tatcattttt 1320 
tagtaataaa ttgcagttat ttatgaaaca aataattttt aaaattatat atgctttctt 1380 
taaaaattga tcttagcatg cttcactatg aagcttgagg cttcactgca cgttgtactg 1440 
aaattatgta taaaacagtg gttctgaaaa tctctgagtt catgacacct ttagtgtctc 1500 
aggttttttt gcttttgttc ttgttttttc tcacaaagca cctaagttaa ataaaaacaa 1560 
agcacaaagc tatcagcttc atgtattaag tagtaagctc ccatgttaac agttgtaact 1620 
tgcctggtgc ccaatagatg tcactctgtt ttcctagaaa ctttaaaata tccctcagtg 1680 
ctcctgttaa ttcatggtag tgccccaagg cactctggca cccagttttg gaactgcagt 1740 
tttaaaagtc ataaattgaa tgaaaatgat agcaaaggtg gaggttttta aagagctatt 1800 
tataggtccc tggacagca 1819 



<210> 4 

<211> 1025 

<212> DNA 

<213> Homo sapiens 

<400> 4 

ttttttcaat taggcagcaa cctttttgcc ctatgccgta acctgtgtct gcaacttcct 60 
ctaattggga aatagttaag cagattcata gagctgaatg ataaaattgt actacgagat 120 
gcactgggac tcaacgtgac cttatcaagt gagcaggctt ggtgcatttg acacttcatg 180 
atatcatcca aagtggaact aaaaacagct cctggaagag gactatgaca tcatcaggtt 240 
gggagtctcc agggacagcg gaccctttgg aaaaggacta gaaagtgtga aatctattag 300 
tcttcgatat gaaattctct gtctctgtaa aagcatttca tatttacaag acacaggcct 360 
actcctaggg cagcaaaaag tggcaacagg caagcagagg gaaaagagat catgaggcat 420 
ttcagagtgc actgtctttt catatatttc tcaatgccgt atgtttggtt ttattttggc 480 
caagcataac aatctgctca agaaaaaaaa atctggagaa aacaaaggtg cctttgccaa 540 
tgttatgttt ctttttgaca agccctgaga tttctgaggg gaattcacat aaatgggatc 600 
aggtcattca tttacgttgt gtgcaaatat gatttaaaga tacaaccttt gcagagagca 660 
tgctttccta agggtaggca cgtggaggac taagggtaaa gcattcttca agatcagtta 720 
atcaagaaag gtgctctttg cattctgaaa tgcccttgtt gcaaatattg gttatattga 780 
ttaaatttac acttaatgga aacaaccttt aacttacaga tgaacaaacc cacaaaagca 840 
aaaaatcaaa agccctacct atgatttcat attttctgtg taactggatt aaaggattcc 900 
tgcttgcttt tgggcataaa tgataatgga atatttccag gtattgttta aaatgagggc 960 
ccatctacaa attcttagca atactttgga taattctaaa attcagctgg acattgtcta 1020 
attgt 1025 



<210> 5 
<211> 21 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Probe/Primer 
<400> 5 

tgcctcagcc tcccaagtaa c 21 

<210> 6 
<211> 21 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Probe/Primer 
<400> 6 

ggccaaaata aaaccaaaca t 21 

<210> 7 
<211> 19 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Probe/Primer 
<400> 7 

tggcaacagg caagcagag 19 
<210> 8
<211> 11801
<212> DNA
<213> Homo sapiens
<220>
<221> unsure
<222> (7470)
<223> Y may represent any of the four nucleotide bases
<400> 8

tccctcttgc gttctgcaat ttctgaaaaa aagatgttta ttgcaaagtg atatgagcac 60
tggaaaggta ctaattccaa tttgattcta attggatgag tgacatgggt aagcgattct 120
aagcatttgt gtttttttta gtagtatgga atttaattag ttctcagtat gttagtgaag 180
atgaatgaaa acatgcatat gtttccatgt attataaata ttttaaaatg caaaaaatta 240
ttctaatgaa tatataaata taaagcataa caataataat acaataccac ccataaagtc 300
atcatctaat ttaaaaacta aaacattaac acttgaatct cccccattgc aacatctttc 360
ccgacttgtg tgtttttttc ttttgctttt aaaatttttg ttttatcata tgtctgcata 420
agattatata gctttccttg ttttaagctt tttaaataat atattgtagt tatattattt 480
gtgctttgct ttttttactt aacattatgg ttctaaaatt cagtaatgtg ttgggcatgt 540
ataatttgtt tatttttaat ctctttgaca ttcgactata taaatttcag tttgtttatt 600
gactcctttg tctatagata ctctgctatt tctgtttttg ctgttacaaa aataatgctg 660
ttttaaattt cattttgtat acttttttga ggcatgtgta tgagttattc taaggtaaaa 720
aaataagaaa aaattgctgg gttataagat tgtcacatgc tcgaatttac aagataatgc 780
caaatcattt ttcaaagtaa ttatacctat ttatactacc ggtatgagta tattggtgcc 840
cacatagttg cttgttctgc caaagtttgg tatgatcgaa caataatttt tgcccatcaa 900
atggcataaa ataaaatctc agtgtgcttt taatttgcat tttctatgtt taagaattgt 960
ttctttttta accatttata atttactttt gctgaaatgc ttgcttatta tttttgctcc 1020
ccattttttc ctattggatt gcttttctca ttaatttata agaattttat atggtttaga 1080
tactaattat tatattactg aaaatacctt tatcagtttg ttgtgtactt tctactttat 1140
gtcttgtgat ggataaaagt tttaaattgt attgtgttga agttaacatt tttaaatttt 1200
ataatcagca tctttaataa tctctttata aaattttcct ttacatagat gtcataaaga 1260
tacatctcta taatttctta tttttttggc atatgttcat taagtcattt tatcattttt 1320
tagtaataaa ttgcagttat ttatgaaaca aataattttt aaaattatat atgctttctt 1380
taaaaattga tcttagcatg cttcactatg aagcttgagg cttcactgca cgttgtactg 1440
aaattatgta taaaacagtg gttctgaaaa tctctgagtt catgacacct ttagtgtctc 1500
aggttttttt gcttttgttc ttgttttttc tcacaaagca cctaagttaa ataaaaacaa 1560
agcacaaagc tatcagcttc atgtattaag tagtaagctc ccatgttaac agttgtaact 1620
tgcctggtgc ccaatagatg tcactctgtt ttcctagaaa ctttaaaata tccctcagtg 1680
ctcctgttaa ttcatggtag tgccccaagg cactctggca cccagttttg gaactgcagt 1740
tttaaaagtc ataaattgaa tgaaaatgat agcaaaggtg gaggttttta aagagctatt 1800
tataggtccc tggacagcat cttttttcaa ttaggcagca acctttttgc ctatgccgta 1860
actgtgtctg cacttcctct aattggggtg agtaagagat tttgttatgt atataatagc 1920
taagaatata gtaataatgg cttaaatcat ggttattttt aaactactaa catttagaag 1980
acaaaataaa aatgctttga aaagtataga ggttttagtg taattagcag ggaataatga 2040
aatgatttga tagggctact cagttttgta taactttggt gctttaagtc tgaatgcaga 2100
gcatggatgt tgtgatccag cctttatatg ttttccctga agaagattta atttatttgg 2160
ccttttgaga aacacatttg gcattgtaat atgttttgct tccaggttct atctccaagg 2220 
ataatttgac aaaatcacac ataaatttat tttcagggca cacagtttcc cttttaggga 2280 
actcacagag gtagagagta atacaataat cacatttgaa tattcagtaa gtgaggtcct 2340 
catagatctt atgtgtatgt caccatgcat ataattttgt taatcactag atgtatgaga 2400 
caagaaattt gaggaatctt aactagagat taaaatcagg gatttaaatc aaagaaacat 2460 
ttaaatgcct cctttattat ttaaatacct gcatgggaga atcattgaaa aaaaaataaa 2520 
aagcatacaa cttgggaata ttataaacca agaagaattt gttattctgg ttgatttttt 2580 
tttcaggctc cgcacaggca acttaccttt atctctttgt gatttttatt tcttgttaaa 2640 
atatacagaa atagttaagc agattcatag agctgaatat aaaatttact acgagatgca 2700 
ctgggactca acgtgacctt atcaagtgac ttatcagtga ggtgagcatt cttaattcag 2760 
ataatggaac ttattatcat aatcttttgc ttatgctatt gttgagctta actacttatt 2820 
catatttgca tatgcatatt gagataatat catttcatta atttcagtac tgaacactaa 2880 
tctcctaaga gtaattgtga aagtttcaga ttgcactatt tttaactata tatctgtatg 2940 
ttatcttcat atatgcttga ataacttata agcaattgaa actttcaatt acagtatact 3000 
attgaagcaa atcaactaat atatacacat atccattagc aatagtagat aatttttgta 3060 
aatgtccagc acagttcttc atatgtagag gatgttcaaa ttggctaagt tccttttctc 3120 
tcttaattat tagtattttt cctactgctc tttgtataat tattccttcc tctttagctc 3180 
caatccttac aatctattct taacatagca actgggaaga aagtttttaa acataaacca 3240 
gatgatgtca ctccacccca caaaacttcc actattctct gtcacacata gaaagaaaga 3300 
aaaaaaatat tgaaaaccta caaagacttg ctatgatctg gtccaggctc tccctaaaat 3360 
ttcatgtaat ttccagccac taggcctttc tggctctcct tcaatctcat tagccttttc 3420 
actactacaa gttagactgg gttttggccg aggtatttct ttttttcata ttttgccttt 3480 
gcctagattg ctcttccaat agatattcac aattgcatca tcatttctat atacgtgcta 3540 
aaaggtttcc ttgtccaaaa tagcttcagt gaccacctga tctagaatag tctcgatcaa 3600 
aagtttcttt tccttttcct caccacttga tatttatatc aaacatttat ttgtgtaatt 3660 
tatgtgtttg tttgttttct gtactagcat tatgatgacc atactatttg atgcccccca 3720 
aaaaatactt tcgagaatga cagggcaaag ctaaaataat taaattatat aattttgaca 3780 
taggcactat tgacaaaaag caattgatgt tatgatagtg ttagatctat gaaatagtac 3840 
tatttaaaag taattctctg aaatacaatt ttctaaaact aaaagcagca tatgtacatg 3900 
aaacaccaaa aaacttcctt atatttatca ctggaagatt taaaatagta taagtagtaa 3960 
cttatttaat atatttttga ttatttaatt aattttatag tatccaactc taatataatg 4020 
ccagtggtat ttgttcaaaa tattttaatg ttgtctattt atttttaatt tgcctaaaaa 4080 
ttatcttaaa tgaaaatttt tggttaataa atttgaaaat actgaaaccc tcatctccag 4140 
tctctgtgga tcctaaagtt tttagttgag aaaataattt ttctctagag aatgaagtag 4200 
cttgtaagct tggagaaatt tctgctaaat aaatgatatt atcaactctt attttcttca 4260 
atacgaaata tataaatatt tcagctcata tatttttgca ggtgctatgc ttttgcttcc 4320 
aatcataatt tctgacaaat attttggaag tcaaaacttg tcttctattt tgttatttaa 4380 
aattatatag actacttttg taaaccttta tactatcaaa tcataggcaa tttcagtttg 4440 
atttcattct ggtgcagaat ataagtttat ccaagtaaaa caggagtcac ttcaaaagat 4500 
tcctcccact gactgagata ttccaaagcc aactttgcaa aatttcagaa ttaaatatta 4560 
tacttctttg taccttcatt ttatttgttc aatttttctt tgtgtttgta gaaaatttta 4620 
atatttttct gttttcaagt tttgatttta atttactact ttataatttt taaaggtaag 4680 
ttttgtgagg ctatattcat tatgtgtttt gaataaagac atacaattaa ttttgagaac 4740 
tgcaataaaa attataagac tattaaaaat gcagtaagtg tactacactt aggctgctaa 4800 
aaatgcagta ccagtagact acatttaggc tgcttaaagt tagttcttct aagtaccata 4860 
tactttaaaa ttttagctaa tgatggagaa caaagacaga aagactgtgt taccatattc 4920 
tagttggcca ttttgttttg ttttgagaga cgtcacatca gccttatcat aaaaattatt 4980 
tggttttacc attttgactg tgagcaaaat atacagcata atatacaaaa taaaatatat 5040 
gtacatcttc acaacttctt gtttaggatg caattatata tatatatata tatatattta 5100 
ttattatact ttaagttcta gggtacatgg caccacgtgc aggttgttac atatgtatac 5160 
atgtgccatg ttggtgtgct gcacccatta actcgtcatt tacattaggt gtatctccta 5220 
atgctatccc tcccctctct ccccacccca caacaagccc cggtgtgcga tgttcccctt 5280 
cctgtgtcca tgtgttctca ttgttcaatt cccacctatg agtgagaaca cgcagtgttt 5340 
gcttttttgt ccttgcaata gtttgctgag aatgatggtt tccagcttca tccatgtccc 5400 
tacaaaggac atgaactcat cattttttat ggctgcatag tattccatgg tgtatatgtg 5460 
ccaccatttt cttaatccga gtctgtccat tgttgttgga catttgggtt gcaattttga 5520 
gtttcatgtg tagcatgtat agcacaacca attaagattt ctttctttct ctcttttttt 5580 
tttttttttg ttgaaatgga gtcttgcctg tctccaaggc tggagcccaa tggtgtgatc 5640 
ttggcttact gcaacctcca cctcccgggt tcaagcgatt ctcctgcctc agccatccga 5700 
gtagctggga ctataggcgt gcaccaccat gcccagctaa tttttgtatt tttagtacag 5760 
acggggtttc accacggtgg ccaggatggt ctcaatttct tgacctcatg attcacccgc 5820 
cttggcctcc caaagtgctg ggattacagg tgtgaaccac caagcccggc ctgtcacaag 5880 
tttttagtgt tctattttaa tacagaaatt agataaatcc aaagagaaag acatttcata 5940 
tgtgcgtaga gttgtcggaa gaaatgagag tcttataaat aactttaaaa attgtgaaga 6000 
aataaaggca aaatagtcct atgcagtttg atttaaatat attcttaata agagctactt 6060 
ttgtgaaaac cagaatattg aaacatgtag atatggatct tcattagtga ctgacataat 6120 
atattgttat tgttactatt ttattgtatc agccaactaa tattgagtgc tttgtgtatc 6180 
ctaagcacta tgctaaacac tgtaccagta ttacctgata taatcatatt aatatttatt 6240 
atttcacttt tcatatgaaa aaattgaagc acagattaag acactccgaa atcatacctc 6300 
tattgattat cagcaccagg atttgaattg aggcactctg atccagagaa gcttttgttt 6360 
ccatgaaggc ttatgttggg gaaaaataat caaattgcct gtacctcagt tgtataaata 6420 
agaggttggg ttggtagatg attctggctg attcagcaga aaagaaattt attcaaagga 6480 
tatcacacag ttttcataac agttaagaat acagaggaaa cagggcacca gggctaagta 6540 
cagaccaaag tccaaaacca ctgccaaagt tgcagcaagg agaacagcac aaatttgctt 6600 
gctgtcaccc gccactagat gcttttgttt ggagccttga acttgactta cactgccact 6660 
gacatcagca ccagtgctct ctgtgtacta ggaggtggag ttggtgacgt tgctgaactc 6720 
aaagcagatg tttctgctgt gaaatagata cctaatacag aacctgcttc ctcattcatt 6780 
ccctccccaa atcatatgct tgtagtgtgg ctagagtttc tgtttctcct tggtccaggc 6840 
agaatttatg aagcttgcta tttatcgcct taaagattag aagaatattc ataaggtatt 6900 
agattgccat aaggttgaac aaatcaacat tcaacttcaa ggattcaaca ttgttttgtt 6960 
ttcttttggg atacctctgc agcagttcaa atcttatttc tgcccttgga caaccaggtt 7020 
tataaatatt gcagattctc cactgactgc tttgatccta tcttctatat ttatgtatac 7080 
taattagcat ataataaaag attatgttac agaatctcaa aattagtaat tatgaattga 7140 
gatggtgtta tacagtacac taacatccaa gagacttgtt tattccaagg aaaatattta 7200 
gagatattaa atgatatttc tcatccttta gacatataca ttttttagct tacagcctgc 7260 
tttaggcaag caacagactc tcaggatctg ctcctaccag ggtctgaaca tttcctccca 7320 
gttttaaaga aacaaattca aataacattg taacctccag aggaaagttc aagctctttt 7380 
atagtattgt ttaaacagta cagctgagga aactaaagac agagaagtta aatgccttgg 7440 
cacttagtct agatttacaa taaactccty tctacttagg acccactaac aggggctgca 7500 
tttacaccaa aaccatgaag gtggcccaag tcatcactga gaagtagtac aagcaccgag 7560 
ggaatgactt caacaggaac aagaaagcgt ggaaggagat cctagcagga agctccacaa 7620 
gaagatagca tgttacgtct tgcattggat gaagcaggtt cagagagacc tagtgacagc 7680 
tatctccgtc aaggtgcaga aggagagatc attgaatgta gcattttcat gcaaaaaaaa 7740 
aaatgttgaa gtctttggac ttcgggagtc tgtccaaact gcaggtcact cagcctacag 7800 
ttgggatgaa tttcaaaaca ccagttggag ccggttgaat ctttctgcta tgctgtaata 7860 
ttttcagtaa acccagcgca acaacaacaa caaaacacaa aaggaggaga agcagccaag 7920 
tctcttggtt tacagagtag ctcctaatac 
gaagatagtc aaaacaatat tcacacctgt 
atctttatat actgcatatt aaggatctgt 
aactccttgt tttgacttat caaagtcctt 
ttcaaatttt agctttatat tatcacttga 
attcactgat aacctacaga caattcccat 
aataaaattg agggttttcc ttacatgttt 
catagatctt cattgaggat tcgcatgtga 
ttatcaacca agttccataa atcatgaaca 
ccaccacatc tcttgtaata aacacagagc 
cggttcaggg cctgctgagt ggcactcagt 
ctttattcca gagcatcaat ggccaaggct 
ggtctcttat ggcctgccaa ttttcacagt 
agacctgtta gaaaaatgtc ggttggaata 
atttcacctc atttttatag gacttgagta 
aatctgaatg ttaggacacc aaatatctcc 
ataataaacc tggggccact gcaggcctca 
gaggaggaaa tgccaatgcc gcacaaatct 
ggcttggtgc atttgacact tcatgatatc 
aagaggacta tgacatcatc aggttgggag 
gactagaaag tgtgaaatct attagtcttc 
atttcatatt tacaagacac aggcctactc 
cagagggaaa agagatcatg aggcatttca 
tgccgtatgt ttggttttat tttggccaag 
ggagaaaaca aaggtgcctt tgccaatgtt 
tgaggggaat tcacataaat gggatcaggt 
taaagataca acctttgcag agagcatgct 
ggtaaagcat tcttcaagaa tcagttaatc 
cccttgttgc aaatattggt tatattgatt 
cttacagatg aacaaaccca caaaagcaaa 
tttctgtgta actggattaa aggattcctg 
atttccaggt attgtttaaa atgagggccc 
attctaaaat tcagctggac attgtctaat 
caaattttaa gtatgtgaat ttagttaatt 
ttcctaaatt ttagactatg aaggtcataa 
tagattgttt ttcattataa tgtcttcatc 
gaaaactgag agttttaatt cataattacg 
attaattgta tgtgattgtt tttatgtaca 
cttttgtact tagtcccatg tttaaaatta 
caattcaatt tacatgatgc taattatggc 
agaatatgtg aattgttcac catacataga 
ctgggtgaaa gagtgctttt gattgaaaga 
gtccctaatc tacccttaat agccatgtgg 
ctcctcatac ttttttcatc tctaaaatga 
tttttttgag atagagtttt gctcttgtca 
gctcactgca accctctgct tcctcggttc 
tgagcccggg attacaggtg cccgccacca 
catgttggcc aggctggtct cgaaccccta 
ctaatcctgc
taacaataat
aaatatttct
caccccagca
cagctcctgg
tctccaggga
cagcggaccc
tttggaaaag
tctctgtctc
tatttctcaa
aaaaaaatct
atgtttcttt
ctgagatttc
9420
cattcattta
cgttgtgtgc
9540
ttctgaaatg
9660
9720
cttgcttttg
9780
tcttagcaat
actttggata
9840
tgttttttat atacatcttt gctagaattt 9900
agctgtgctg atcaattcaa aaacattact 9960
attcaacaaa tatatctaca catacaatta 10020
ttaacagaat tgtctttgtg attgttttta 10080
ttgatcaaaa aattgtggga acaatccagc 10140
taaggagtct taagcttggt gccttgaagt 10200
ctactttata tctaaagcat ttatgttttt 10260
10320
aagtcctggg actacaggcg tgagccacca cgcccagccc aacatcagtt tttctttttt 10860 
aacacaaggc taacacaatc aaaatactag ctaggggaga aaaaaaaaat aaggcactgt 10920 
ttatgcgtaa caggctcttg ttgcaatcca ctggggcaga ccaaataaac agtaagaatc 10980 
aaatcctttt catataatcc tttctttgca gaatacataa aatccccaca aatggcttat 11040 
cttccttttt atgatatgtt ggagaattgt agctaagtga cagatatttt gcttgggtgt 11100 
atagaccaca aaggactgtg tcttgatgat ggtttgcaca aaattatacc ttagttttta 11160 
ctttgtatgt tacatgttag atttagagta tgaaaattag tagggaggat tattaacaaa 11220 
gaacagggca agaggagtag aattaaacct cttctaatac ctgtgcacaa gtaggctttt 11280 
cagaaactcc acaaccccaa cataaactgg atagttagaa aagcacactc ccaaggaagg 11340 
cggttatgtt ttgcagtttg aatcagaaga atagagctat agcaatcttc attctatagt 11400 
aacattaaag agcctggttt atattatagc agtcattaag atttaaaaat ttacatcttg 11460 
ccgttcttct cactcacaga ttttcgagag gtaatgtaat gatcacacga ggtgagaatc 11520 
actgcctttt ataatgcgat taaatgcatg aacaaagttt ccaacaaata acagtaataa 11580 
aaagaaacat gtattagcac ttaataagcc aggtgctgta cgacgtgtgt tacatgcttt 11640 
caatccatga actggtaaac tggtactagt atctctattg gacatgtgag gaaaccaaat 11700 
ggagttgata aacagtagag ttaaaaatta ctcttcatat attatattgc ctcaatctca 11760 
cagacatctc tgctaccaaa agctatcata tctagactcg a 11801 



<210> 9 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 9

tggcaacagg caagcagag 19



<210> 10 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 10

ggccaaaata aaaccaaaca t 21

<210> 11
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Probe/Primer
<400> 11

gcaaatatga tttaaagata caac 24



<210> 12 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 12

ggttgtatct ttaaatcata tttgc 25



<210> 13 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 13

actgtctttt catatatttc tcaatgc 27



<210> 14 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 14

aagtagtaat tttaaacatg ggac 24



<210> 15 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: Probe/Primer
<400> 15

tttttcaatt aggcagcaac c 21

<210> 16 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 16 

gaattgtctt tgtgattgtt tttag 25 

<210> 17 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 17 

caattcacaa agacaattca gttaag 26 

<210> 18 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 18 

acaattagac aatgtccagc tga 23 

<210> 19 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: Probe/Primer
<400> 19 

ctttggctga tatcatgaag tgtc 24 



<210> 20 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer
<400> 20 

aaccttttgc cctatgccgt aac 23 



<210> 21 
<211> 22 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Probe/Primer

<400> 21 

gagactccca acctgatgat gt 22 



<210> 22 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Probe/Primer 
<400> 22 

ggtcacgttg agtcccagtg 20 